
DSM

Worst Practices for Domain-Specific Modeling

October 12, 2009 16:52:41 +0300 (EEST)

One of the surprises for me at Code Generation 2009 was during the keynote, when I passed round a list for people to sign up to receive a pre-print of the Worst Practices for Domain-Specific Modeling article that was to appear in IEEE Software. When the list came back, I was astonished to see that basically the entire audience had signed up -- never underestimate the appeal of "free"!

I think the article is important, in that it is the first study of a large sample of DSM languages. The 20 worst practices identified were analysed from a sample spanning:

  • 76 cases of DSM
  • 15 years
  • 4 continents
  • several tools
  • 100 language creators
  • 3 to 300 modelers per case

IEEE Software has now published the next issue after the special issue on DSM, so it seems a fair time to point you to the final version of our article, available for free download:

Worst Practices for Domain-Specific Modeling
Steven Kelly and Risto Pohjonen
IEEE Software, vol. 26, no. 4, pp. 22-29, July/Aug. 2009
doi:10.1109/MS.2009.109

Stop press: thanks to the efforts of the tireless Yoshio Asano of Fujisetsubi, the article is also available in Japanese from the same page. Domo arigato, Yoshio-san!

DSM

Using UML takes 15% longer than just coding

October 07, 2009 16:20:51 +0300 (EEST)

In the keynote at Code Generation, I mentioned that empirical research shows that using UML does not improve software development productivity: depending on the study, reports ranged from -15% to +10% compared to just coding. I guess most people these days know those results from their own experience, but as the reports I was aware of were from the 1990s, it was interesting to see a more up-to-date article recently:

WJ Dzidek, E Arisholm, LC Briand: A Realistic Empirical Evaluation of the Costs and Benefits of UML in Software Maintenance, IEEE Transactions on Software Engineering, Vol 34 No 3, May/June 2008

Unlike many earlier studies, this uses professional developers and reasonably large tasks. The tasks all extended the same Java Struts web application, in total about 30 hours per developer. 10 developers performed the tasks with Java, and another 10 performed the same tasks with Java and Borland's Together UML tool. The developers using UML were somewhat more experienced -- 256 kLOC of Java under their belts rather than 187 kLOC, and 44% longer Struts experience -- but otherwise the groups were similar. Time was measured until submission of a correct solution, giving a reasonably sound basis for comparison. Here are the results:

Time to correctly complete 5 tasks: with UML 2300 minutes, without UML 2000 minutes

Compared to just coding, using UML took 15% longer to reach a correct solution (the green bar). In addition, it looks like even using UML to help you understand the code gives no benefit over just reading the code: the blue and red bars are the same length as the purple bar. As the tasks only looked at extending an existing system with existing models, we can't say for sure whether the story is the same in initial implementation, but other studies indicate that it is.

One bad thing about the article is that it tries to obfuscate this clear result by subtracting the time spent on updating the models: the whole times are there, but the abstract, intro and conclusions concentrate on the doctored numbers, trying to show that UML is no slower. Worse, the authors try to give the impression that the results without UML contained more errors -- although they clearly state that they measured the time to a correct submission. They claim a "54% increase in functional correctness", which sounded impressive. However, alarm bells started ringing when I saw the actual data even shows a 100% increase in correctness for one task. That would mean all the UML solutions were totally correct, and all the non-UML solutions were totally wrong, wouldn't it? But not in their world: what it actually meant was that out of 10 non-UML developers, all their submissions were correct apart from one mistake made by one developer in an early submission, but which he later corrected. Since none of the UML developers made a mistake in their initial submissions of that particular task, they calculated a 100% difference, and try to claim that as a 100% improvement in correctness -- ludicrous!

To calculate correctness they should really have had a number of things that had to be correct, e.g. 20 function points. Calculated like that, the value for 1 mistake would drop by a factor of 20, down from 100% to just 5% for that developer, and 0.5% over all non-UML developers. I'm pretty sure that calculated like that there would be no statistically significant difference left. Even if there was, times were measured until all mistakes were corrected, so all it would mean is that the non-UML developers were more likely to submit a code change for testing before it was completely correct. Quite possibly the extra 15% of time spent on updating the models gave the developer time to notice a mistake, perhaps when updating that part of the model, and so he went straight back to making a fix rather than first submitting his code for testing. In any case, to reach the same eventual level of quality took 15% longer with UML than without: if you have a quality standard to meet, using UML won't make you get there any more certainly, it will just slow you down.
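The arithmetic behind that estimate can be sketched in a few lines. The 20 function points are the illustrative figure used above, not a number from the study, and the function name is purely hypothetical.

```python
from fractions import Fraction

def error_rate(mistakes, checks_per_developer, developers=1):
    """Share of checked items that were wrong, as an exact fraction."""
    return Fraction(mistakes, checks_per_developer * developers)

# One all-or-nothing check for the one developer who slipped up.
all_or_nothing = error_rate(1, checks_per_developer=1)

# Assumed alternative: 20 function points that each have to be correct.
one_developer = error_rate(1, checks_per_developer=20)
all_non_uml = error_rate(1, checks_per_developer=20, developers=10)

print(f"{float(all_or_nothing):.1%}")  # 100.0%
print(f"{float(one_developer):.1%}")   # 5.0%
print(f"{float(all_non_uml):.1%}")     # 0.5%
```

The same single mistake shrinks from 100% to 0.5% purely because of what it is divided by, which is why the choice of denominator matters more than the headline percentage.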

To their credit, the authors point out two similar experiments as related work. One showed UML took 27% longer, the other 48% longer. The percentage of time spent updating models was also larger: 30-35% (which may be because those studies only measured time until the first submission of a solution: correcting bugs was probably mostly coding, so if measured to a correct solution the UML time would only increase a little and hence the percentages would drop).

So what do we learn from all this? Probably nothing new about UML, but at least a confirmation that earlier results still apply, even for real developers on realistic projects using today's UML tools. Maybe more importantly, we can see that empirical research, properly written up, is valuable in helping us decide whether something really improves productivity or not. Ignore the conclusions (they probably existed in the minds of the authors before the paper was written), but look at the data and the analysis. Throw out the chaff, and draw your own conclusions from what is left. Above all, don't blindly accept or reject what they say, just because it agrees or disagrees with your existing prejudice. There's at least a chance that you might learn something!

DSM-tech

MERL primer for openArchitectureWare Xpand users

August 31, 2009 12:00:08 +0300 (EEST)

MERL and openArchitectureWare's Xpand languages are rather similar in approach and functionality. The main differences are in syntax and keywords, so learning one if you know the other is easy. In the MetaEdit+ forum I've posted a quick primer on how to translate from Xpand to MERL. The primer is below, with comments after the respective row where necessary.

Xpand MERL
SomeFixedText 'SomeFixedText'
«SOMECOMMAND» SOMECOMMAND
In Xpand, commands are quoted, whereas in MERL fixed text is quoted.
«DEFINE foo ... ENDDEFINE» Report 'foo' ... EndReport
In MERL, each report is defined on its own, not with several reports one after another in a single text.
Each report is defined on a particular Graph type (or the supertype of all, Graph itself).
«FILE Name + ".java" ... ENDFILE» filename :Name '.java' write ... close
«EXPAND foo FOREACH Bar» foreach .Bar { subreport 'foo' run }
You don't need to define a subreport, you can just put the commands from foo in the {}.
This often makes more sense, e.g. if you're not going to call foo from elsewhere.
 
foreach is for navigation from a graph to its elements.
For navigation between elements use "do" or "dowhile" (which covers SEPARATOR):
«EXPAND foo FOREACH Bar» do .Bar { subreport 'foo' run }
«EXPAND foo FOREACH Bar SEPARATOR ","» dowhile .Bar { subreport 'foo' run ','}
«this.name» :name
«LET ... AS var ..<var>.. ENDLET» variable 'var' write ... close ..$var..
For assignments with a single element on the right-hand side, you can use the shorter form: $var = 'foo', $var = :foo etc.
«REM...ENDREM» /*...*/
«PROTECT ID ... ... ENDPROTECT» md5id ... md5block ... md5sum
«CSTART ... CEND ...» filename 'bar.java'
md5start ... md5stop ...
merge .. .. .. close
The start and end sequences are specified in the filename command, since they will be the same for the whole file
«this.name.toUpper()» :Name%upper
For text manipulation, e.g. with Java in oAW, you can use MERL translators. Many are defined in _translators, such as %upper a-z A-Z, and you can define your own to convert any combination of characters, strings and regular expressions.

DSM-tech

Re: Processing of MetaEdit Models with oAW

August 28, 2009 14:49:21 +0300 (EEST)

Heiko Kern has written a great set of information on how to process MetaEdit+ models with oAW (the openArchitectureWare model transformation tools for Eclipse). The integration he's built is a great example of how easy it is to integrate MetaEdit+ with other tools:

  • You can export models or metamodels from MetaEdit+ as XML. The format is an extension of the open Graph eXchange Language standard, GXL, supported by over 40 tools.
  • You can quickly write a little generator to output your models in whatever XML or other text format you want.
  • MetaEdit+ can call other tools from its generators, e.g. for build integration.
  • Other tools can call MetaEdit+ with command-line parameters to specify a series of actions to run.
  • Other tools can call MetaEdit+ through its WebServices / SOAP API, to create/read/update/delete any data in models, and for control integration, e.g. to animate models for model-level debugging.
  • You can import models or metamodels as XML.
  • You can import text in any format and convert it to models via reverse engineering generators.

At last year's OOPSLA DSM workshop, Heiko had an article about his MetaEdit+ / Eclipse integration. We had a good discussion about it, in particular about his reasons for building it. His paper gave the impression that he wanted to use oAW rather than MetaEdit+'s own MERL generator language, because he needed some specific features in oAW. It turned out though that he hadn't actually used MERL, and didn't realise that MERL and oAW's XPand are actually very similar in terms of approach and functionality.

MERL tends to be a little more succinct: here is the MERL generator to output simple Java classes for a UML Class Diagram, as in Heiko's example:

subreport '_translators' run

foreach .Class [UML]
{  filename id '.java' write
      'public class ' id ' {' $cr2

      do :Attributes
      {  '   ' :Visibility ' ' :Data type; ' ' :Name ';' $cr2

         '   public void set' :Name%firstUpper '(' :Data type; ' ' :Name ') {' $cr
         '      this.' :Name ' = ' :Name ';' $cr
         '   }' $cr2

         '   public ' :Data type; ' get' :Name%firstUpper '() {' $cr
         '      return this.' :Name ';' $cr
         '   }' $cr2
      }
      '}'
   close 
}

Heiko's oAW XPand code is 65% longer. Even ignoring the extra loop over all Class Diagrams that Heiko needs (MetaEdit+ offers that automatically in the UI or via the forAll:run: command-line parameter), oAW is still over 20% longer. The actual difference isn't that important: I'm sure both could be made shorter for this example, but the current code is typical of what is generally written. My point is that there's no real saving to be had by using XPand instead of MERL. If your models are in MetaEdit+, use MERL; if they're in Eclipse, use oAW. Having integration is great, but if you can avoid using it then that's even better.

DSM

Code Generation 2009 round-up

June 24, 2009 01:25:20 +0300 (EEST)

Once again, Code Generation proved itself as the best European conference on Model-Driven Development. Lots of smart people, lots of experience, lots of enthusiasm, lots of willingness to listen and learn from others. Even though having to prepare and run some sessions kept me from seeing as much of the rest as I'd like, there's still too much to write for one blog post. I'll post about things I'm certain of first, and come back to things like Xtext and MPS after further investigation.

Keynotes

The two keynotes, both presented as a double act by me and Markus Völter, seemed to go down well. Mark Dalgarno had a surprise up his sleeve, presenting us with a blind choice of weapons from a black bag. We then had to duel it out, graphical DSM against textual DSLs, with the plastic gun and dagger we picked. Since I got the gun, I think the result was a foregone conclusion :-). The dagger may be a "weapon from an earlier, more civilized age", but it's only useful if you can get in close to your adversary. Similarly, text may be more familiar, but it does often tie you closer to the code; problem domain DSLs in text seem as rare as accurate knife throwers. Markus successfully stabbed me in the back later on, so that evened things up and emphasized the point from our slides: both text and graphics are useful in the right place. Choose, but choose wisely.

It was fun to see the keynote get picked up on Twitter:

EelcoVisser: keynote by @markusvoelter and Steven Kelly at #cg2009: great overview of issues in model-driven development
HBehrens: Steven Kelly at #cg2009 keynote: "wizard based generators create a large legacy application you've never seen before"

The latter was picked up by several people. The reference was to vendor-supplied wizards, often found in IDEs or SDKs, that create skeleton applications for you based on your input. Since the vendors take pride in just how much boilerplate they can spew out, you're left with a mass of generated code that you've never seen before, but must extend with your own code. Worse, you're responsible for maintaining the whole ensuing mixture, and there's no chance of re-running the wizard to change some of the choices -- at least not without losing or invalidating the code you've added. That's in sharp contrast with generation in DSM, where your input is in the form of a model which you can edit at any time. You get the speed of generation but can remain at a high level of abstraction throughout.
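The contrast can be made concrete with a toy sketch. Everything here is hypothetical (the model is a plain dict and the generator a single function); it only illustrates that when the input is an editable model, changing an earlier choice means editing the model and regenerating.

```python
def generate_skeleton(model):
    """Generate a class skeleton from a tiny model held in a plain dict."""
    lines = [f"class {model['name']}:"]
    for field_name in model["fields"]:
        lines.append(f"    {field_name} = None")
    return "\n".join(lines) + "\n"

model = {"name": "Customer", "fields": ["name", "email"]}
first = generate_skeleton(model)

# Changing an earlier choice is a model edit followed by regeneration;
# no hand-written code is mixed into the generated output.
model["fields"].append("phone")
second = generate_skeleton(model)
print(second)
```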

MetaEdit+ Hands-on

We'd decided to try something special in the hands-on: building 5 different graphical modeling languages from scratch in under 3 hours. Rather than being random exercises, the languages were increasingly good ways of modeling the same domain. We started with something that was basically just the current code turned into graphics, and ended up with a language that reduced the modeling work to a third of what it was at its worst, with many possible errors ruled out by the language design and rules, and with much better scope for reuse. We showed how to make generators for all the languages, and actually built them for two. And of course since this was MetaEdit+, simply defining the metamodel already gave you a full graphical modeling environment -- we just tweaked the symbols to taste.

Never having run the session before, we were rather nervous about how much we could achieve in the time available. In the end, thanks to great slides from Risto Pohjonen and testing from Janne Luoma, it seems we pretty much hit our target. Only at the very end of the last language did we have some people only just starting the last section (the generator) while others were finishing it and going on to beautify the symbols or play around with other fun features of MetaEdit+. Hopefully people learned not just about MetaEdit+ as a tool, but also how to make better languages and improve existing ones. Feedback online was encouraging:

PeterBell: Great metaedit hands on - built and refactored language and generator in just a couple of hours at #cg2009
elsvene: been to a great hands-on session for MetaEdit+. Really interesting tool! #cg2009
HBehrens: for me MetaEdit is the most sophisticated graphical modeling tool currently available #cg2009. Thanks for this session!

Dinner

The conference dinner was of the high standard you'd expect from a Cambridge college. The airy hall and contemporary art lent a friendly ambience. The large round tables weren't particularly conducive to conversation: you could only really talk to the people either side of you without shouting or craning your neck. On long tables you can reach 5 people for the same effort. I was fortunate to be sitting between Scott Finnie and Jon Hurwitz, so I certainly didn't suffer.

The "suffering" started later, when there was a raffle in aid of Bletchley Park, the home of Allied code-breaking work in World War II. I ended up winning a prize donated by Microsoft: a screwdriver toolkit and MSDN T-shirt, causing much hilarity and bad jokes about finally getting Microsoft tools that didn't crash. The irony continued when Alan Cameron Wills won a signed copy of our Domain-Specific Modeling book -- despite having received one from us last year. Either the older British segment of the audience were most inclined to support Bletchley Park by buying raffle tickets, or then the draw was rigged to encourage vendor co-operation. The people on my table were having none of that, and encouraged me to cover up the Microsoft logos :-). All in all a good laugh, and in a good cause.

DSM-tech

Oslo Quadrant reviews

June 12, 2009 19:17:12 +0300 (EEST)

The May 26th CTP of Oslo includes the first public version of Quadrant, Microsoft's visual model editor. I've had my head down on other topics, so haven't had a chance to play with it yet, but here are some reviews from others.

Charles Young, Initial experiments

Microsoft is publically committed to providing strong UML and XMI support in 'Oslo' and this is our first glimpse of what they intend. ... My initial experiments with LoadUML suggest that the tool is not yet fully functional. For example, it fell over the use of the xmi:type attribute on the uml:Model element. It failed to handle a type element of an ownedAttribute, and it didn't recognise the packageImport element. The error messages were not always very helpful and the tool is slow...
Initial experiments with LoadAssembly went a little more smoothly. Again, the tool is very slow, and can take several minutes to complete imports...
This early version of Quadrant has big problems with big models. It could, in some cases, take several minutes of 100% CPU usage to display the contents of a folder. Memory usage can also grow to monumental proportions...
All in all, don’t expect Quadrant or the new loaders to behave very well. This is very early preview code.

Charles did manage to get an XMI file and .NET assembly imported after some messing around, so it wasn't all bad. But those speed and memory problems aren't going to go away just by optimising code: scalability is something that must be architected in from the start.

Frank Lillehagen, Quadrant - First Impressions (I had the pleasure of meeting Frank in May 2001, when he was VP at Computas and responsible for the Metis modeling tool - first released in 1991!)

Quadrant's user interface is novel, uniform, and functional, but a bit cumbersome, and as an early preview it exposes a lot of the underlying wiring, nuts and bolts. Some functionality is well supported, such as customizing views and interacting with large models in multiple workpads. On the other hand, services for e.g. relationship modeling are poor. ... Visualization is the focus, more than modeling.
The layout of diagrams is partially automated, however when you close and reopen a diagram, it will revert to an automatic layout, not keeping the location changes you made manually the last time.
The support for key visual modeling concepts like relationships is not native, and limited. Quadrant does not recognize many-to-many relationships from entities, leading to diagrams ... where [half] the shapes are really relationships that ... should be shown as links.

From the pictures Frank posted, the existing models in Oslo break many principles of good modeling design. Having automatic layout that loses your manual layout changes pretty much rules out the chance of getting to know your way around your models, for any diagram more complex than a simple tree. And having no n-ary relationships is going to mean unwelcome hacking for both metamodelers and modelers: many relationships are binary, but certainly not all.

I'll continue to follow the progress of Quadrant with interest, but there seems little point getting my hands dirty with it yet. It's a shame that it seems to be back to square one for modeling at Microsoft - this is like the early versions of DSL Tools, and you'd think they'd have moved on in the 5 years since we first saw that. When we did a complete rewrite of MetaEdit (released 1993) to get the first version of MetaEdit+ (1995), there was rather a lot more that worked, and the scalability was already in place. The UI wasn't pretty, so we'll give Quadrant the thumbs up on that score, but the real worth of an application like this lies between the UI and the database. If Quadrant only works for binary relationships, autolayout, and small models, there's some major rework needed before it becomes a serious contender. Let's hope their bosses give them the chance to do it!

DSM

Getting ready for Code Generation

May 19, 2009 20:31:11 +0300 (EEST)

Markus Voelter and I are having fun at the moment preparing our keynotes for Code Generation. The descriptions on the web page are deliberately vague, but the important fact is there: we'll be giving both keynotes together.

As frequent conference attendees will know, Markus and I are both quiet, meek guys who would never presume to disagree, so the talks will most likely be boring consensus... NOT! I did suggest mud wrestling would be an easier way to settle our differences, but my imposing physical presence must have convinced Markus he'd have a better chance with PowerPoints at twenty paces.

In related news, Mark Dalgarno has finally realized that the concepts of "early bird" and "software developer" make uneasy bed-fellows, and the way to get people to sign up some reasonable time before conferences is to use the stick not the carrot. Yes, there's now a special not-very-early-bird price increase of 10% extra heading your way if you don't go to the site NOW and register.

It's not all stick though: if you were there at either previous conference, you get 5% off. Canny forward thinker that he is, and with CodeGen 2010 clearly in mind, Mark isn't offering 10% off if you were there both years (darn!).

However you cut it, Code Generation is simply the best conference on DSM in Europe. Even without the mud wrestling.

DSM

Playing with Martin Fowler's DSM language

March 18, 2009 13:08:40 +0200 (EET)

The roadmap for Martin Fowler's forthcoming book on DSLs indicates that he will focus on textual DSLs. The online draft of the intro does however briefly show a graphical language for a home security system: the model in Figure 6 is implemented with MetaEdit+, based on the original textual requirements:

Miss Grant has a secret compartment in her bedroom that is normally locked and concealed. To open it she has to close the door, open the second draw in her chest, turn her bedside light on - and then the secret panel is unlocked for her to open.

Juha-Pekka has been using Martin's example as a way of showing how to implement a DSM language in MetaEdit+ (Parts 1 and 2), and in Part 3 he points out some problems with the original language: too broad a focus, unclear usage process, and too low a level of abstraction. Juha-Pekka correctly suggests going back to the basics of the domain to discover the necessary language concepts, rather than trying to shoehorn this domain into a generic state model.

As an exercise, however, I thought it might be interesting to try to improve Martin's language as it is, rather than starting from scratch. How much of DSM is "you just have to know how to do it", and how much can be reduced to simple steps that anyone could apply? Obviously, the more of the latter that we can find, the easier it is for somebody to get started. Our DSM book aimed at just this kind of practical approach; let's take a few hints from there and apply them to Martin's language. We'll show the model in the current state of the language as we evolve it: click the pictures to see the full size screenshot.

Use meaningful symbols

Miss Grant's model with meaningful symbols

Martin's language uses just black and white shapes, the kind you might see in a standard flow chart palette. Only the text within the shapes gives a clue as to the actual domain: words like "door", "drawer", "light" and "panel" occur many times. However, the brain takes a lot longer to find all occurrences of a word in a picture than it does to find all occurrences of a symbol. Try it yourself: how many times does the door symbol appear in the picture on the right, and how many times does the word "door" appear in Martin's Figure 6? (You'll actually notice a slight discrepancy: Martin's diagram omits the "reset all / return to start" event caused by the door opening, shown at the extra door at the top left in our diagram; he mentions this elsewhere in the draft.)

Reuse objects

Miss Grant's model with objects reused

Having four door objects like this obviously isn't ideal: Martin had them, but the problem wasn't so visible there because they a) couldn't be distinguished from other objects, and b) didn't so clearly represent something in the physical world -- the problem domain. Now that we have them visible, it would be nicer if we could show that there's really only one door in this model, and it is involved in four different events or actions. So let's merge the four doors into one, and similarly for the panel.

The light bulbs and drawers are harder: if we merge them, we end up with either lots of crossing lines, or objects on top of each other -- ugly. Maybe there's something else we could do for them?

Consider n-ary relationships

N-ary relationships -- relationships involving more than two objects -- are everywhere: a "family" relationship links a father, mother and children; an inheritance relationship links a superclass with several subclasses. When people draw a diagram on paper, they're happy drawing lines that split. However, implementers of modeling tools have often misanalyzed the simplest and most common case of a binary relationship, and ended up thinking relationships can only connect two objects. They end up having to represent n-ary relationships with a fake object in place of the relationship. Such fake relationship objects leave the modeling language inconsistent, as the user can draw a "relationship object" on its own without connecting it to anything. They also make checking model correctness much harder, as the rule for what can be connected in a certain kind of relationship must be split over several relationships, all cobbled back together through the fake relationship object. (For more details, see Welke's article from my previous entry on The Model Repository.)
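As a rough illustration of the difference, here is a minimal sketch of a relationship that carries any number of roles, so the rule for what may be connected lives in one place. The class names, role names and the validity rule are all assumptions made for this example, not the data model of any particular tool.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Obj:
    name: str
    kind: str

@dataclass
class Relationship:
    """A relationship with any number of roles, each bound to one object."""
    kind: str
    roles: list = field(default_factory=list)  # (role name, Obj) pairs

    def connect(self, role, obj):
        self.roles.append((role, obj))
        return self

def is_valid_inheritance(rel):
    """Hypothetical rule: exactly one superclass, at least one subclass,
    and every participant is a Class."""
    supers = [o for r, o in rel.roles if r == "superclass"]
    subs = [o for r, o in rel.roles if r == "subclass"]
    all_classes = all(o.kind == "Class" for _, o in rel.roles)
    return (rel.kind == "Inheritance" and len(supers) == 1
            and len(subs) >= 1 and all_classes)

shape = Obj("Shape", "Class")
circle = Obj("Circle", "Class")
square = Obj("Square", "Class")

inheritance = (Relationship("Inheritance")
               .connect("superclass", shape)
               .connect("subclass", circle)
               .connect("subclass", square))
print(is_valid_inheritance(inheritance))  # True
```

With a fake relationship object, the same check would have to be reassembled from three separate binary links, and nothing would stop the fake object from existing with no links at all.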

Miss Grant's model with several events allowed for a transition

If you're lucky enough to have a tool that supports n-ary relationships properly, take a look at your modeling language and see if you can make a more complex structure of objects into a simpler one by connecting several objects with a single relationship. In this case, Martin already shows excellent taste by using n-ary relationships for transitions :-) -- but maybe we can go a bit further still. On the left path between Active and Unlocked panel we can see the sequence "Drawer opens", "Waiting for light", "Light on"; on the right path we have "Light on", "Waiting for drawer", "Drawer opens". If we go back to the original text, we can see that all this means is that we wait for the drawer to open and the light to come on: both must happen, but in either order. So why not just have a single transition with two events to trigger it? We can make that the default semantics of a transition: it waits for all attached events to happen, in any order. To remind ourselves that such transitions will wait for all events, we show a little block on them where the lines meet. If we wanted to support the case where either one event or the other could happen, but not both (XOR), we could have a property in the transition to specify whether it is AND or XOR, or then simply require the user to draw two transitions between the two states. In either case the amount of extra work for the XOR case is much less than is needed in a generic state model for the AND case, which requires the insertion of extra "Waiting for..." states -- only two here, but imagine covering all possibilities if there were 5 events that could happen in any order. (Exercise for the reader: how many "Waiting for" states would that require? Hint: we can do better than 5 factorial.)
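The AND semantics can be sketched in a few lines of Python. The class and function names are hypothetical, and the count assumes that a generic state model merges paths so that each "Waiting for" state is identified only by the set of events seen so far, not by their order.

```python
from itertools import combinations

class AndTransition:
    """Fires once every attached event has occurred, in any order."""
    def __init__(self, source, target, events):
        self.source, self.target = source, target
        self.pending = set(events)

    def on_event(self, event):
        """Record an event; return the target state once none are pending."""
        self.pending.discard(event)
        return self.target if not self.pending else None

def waiting_states(events):
    """Intermediate states a generic state model would need instead:
    one per non-empty, incomplete set of events already seen."""
    events = list(events)
    return [set(c) for k in range(1, len(events))
            for c in combinations(events, k)]

t = AndTransition("Active", "Unlocked panel", {"Drawer opens", "Light on"})
print(t.on_event("Light on"))      # None
print(t.on_event("Drawer opens"))  # Unlocked panel
print(len(waiting_states(["Drawer opens", "Light on"])))  # 2
```

Calling `waiting_states` with five event names gives one possible answer to the exercise under the merging assumption stated above.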

Rule out corner cases

Top level model for Miss Grant

An important feature of good DSM languages is that they make the job of the modeler easier. In Martin's language, one of the hard things to see is in which states are the panels, doors etc. unlocked. We can see the actions that unlock them, but to know the state of a panel in a given state, we need to play through all possible routes that can get to that state. As the whole point of this language is to describe when things are locked or unlocked, this is quite a serious problem. Is there a way that we can make things clearer to the modeler? If we look at a few of these models, we see a common pattern emerge. On the right here a panel is unlocked by a state "Panel unlocked", and there is a transition from that state when the panel is closed, to a state that locks the panel again. This unlock->close->lock sequence appears in many models, and makes sense in the problem domain. So why not allow a shortcut syntax, where the panel itself plays the role of a state, as at the bottom in this picture: on entering the panel state, the panel is unlocked; we leave the panel state when the panel is closed, and on leaving it we lock the panel. Since we can specify the semantics like this, we can obviously make the generators produce the required code: we can do this by extra steps in the generator, by a model-to-model transformation that produces a more generic state machine, or by a more powerful state machine engine in the framework. The last would be my choice, as that way the code generated per model has the closest resemblance to the models, stays on a higher level of abstraction, and keeps the overall size of the application down.
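To give a feel for the third option, here is a minimal sketch of a state machine engine that supports entry and exit actions, so that the panel shortcut needs no extra states in the generated code. All names here are hypothetical and this is not the framework code of any real generator.

```python
class Panel:
    """Stand-in for the physical panel; only tracks whether it is locked."""
    def __init__(self):
        self.locked = True

class State:
    """Generic state with optional entry and exit actions."""
    def __init__(self, name, on_entry=None, on_exit=None):
        self.name = name
        self.on_entry = on_entry or (lambda: None)
        self.on_exit = on_exit or (lambda: None)

def panel_state(panel):
    """The shortcut: unlock the panel on entry, lock it again on exit."""
    def unlock():
        panel.locked = False
    def lock():
        panel.locked = True
    return State("Panel", on_entry=unlock, on_exit=lock)

class Engine:
    """Runs exit and entry actions whenever an event causes a transition."""
    def __init__(self, start, transitions):
        self.state = start
        self.transitions = transitions  # {(state name, event): next State}
        start.on_entry()

    def fire(self, event):
        nxt = self.transitions.get((self.state.name, event))
        if nxt is not None:
            self.state.on_exit()
            self.state = nxt
            nxt.on_entry()

panel = Panel()
idle = State("Idle")
unlocked = panel_state(panel)
engine = Engine(idle, {("Idle", "light on"): unlocked,
                       ("Panel", "panel closed"): idle})
engine.fire("light on")
print(panel.locked)  # False
engine.fire("panel closed")
print(panel.locked)  # True
```

The code generated per model then only has to list states and transitions; the unlock and lock behaviour lives once in the engine.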

Lower level model for Miss Grant

That takes care of the case where the panel unlock-close-lock sequence can be considered as an atomic element in the model, with nothing else happening during it. What about cases like the door being unlocked, during which there is a sequence of other events needed before it is again locked? In this case we can use a sub-model: in the figure above, the green padlock relationship connecting the "Door unlocked" state to the door means "during this state, the door is unlocked" -- i.e. on entry the door is unlocked, and on exit it is locked; as before, closing the door exits that state. The little blue star in the "Door unlocked" state indicates that it has a submodel, shown in this figure. The contents of the submodel are of course just the set of states during which the door is always unlocked. Now it's easy for the modeler to know whether the various secret compartments are locked or unlocked at each stage -- and of course thus to ensure that the system he designs has no holes in its security. And since the code is generated, we'll never forget those pesky bounds checks, so there'll be no buffer overruns to exploit :-).

Keep models compact

Miss Grant's model merged back into a single diagram

Sub-models are great for hiding complexity and making the modeling language scale better. If each model becomes too small, however, many people find it harder to understand. Those with a Lisp or Smalltalk background are used to methods being only 2-4 lines of code; in more commonly used languages several such methods tend to be grouped together into a single larger method. Obviously extremes in either direction are bad; providing we stay within the bounds of what is sensible, we can choose the option that the modeler feels more comfortable with. In this diagram we have combined the two small models back into one larger one, stretching the "Door unlocked" state to enclose its substates. We can still see during which states the door and panel are unlocked, and maybe the overall picture is clearer -- or maybe not. In any case, this slightly reduces the number of model elements compared to the previous step.

Metrics

If we count each object as 1 element, each binary relationship as 1 element, and each additional role or property as 1 element, plus 1 for each model, we get the following size metrics for the models above:

41  Initial
41  Use meaningful symbols
36  Reuse objects
27  Consider n-ary relationships
23  Rule out corner cases (includes submodel)
19  Keep models compact

As can be seen, we've reduced the size of the model by over 50%. Since the effort needed for a given project increases more than linearly with size, we can estimate that productivity increases compared to the original language by a factor of at least 2. Improving symbols and cutting out corner cases weren't aimed at reducing the size of the model, but will have significant improvements in usability, so I'd guess overall a factor of around 3 is reasonable. Note that this is on top of whatever improvement is gained by Martin in moving from a straight hand-coding solution to a DSL, and from a textual DSL to a graphical DSL. More importantly, though, these are the kinds of steps that anyone can see how to apply to their own modeling language, and any team of developers would benefit from at the modeling level. Interestingly, with MetaEdit+ all of these changes could be applied to the modeling language without throwing away the initial model: the initial model and all intermediate models remain valid throughout, as the language evolves.
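The size reduction can be checked directly from the figures above; this short sketch only restates those numbers and makes no attempt to model the more-than-linear effect on effort.

```python
sizes = [
    ("Initial", 41),
    ("Use meaningful symbols", 41),
    ("Reuse objects", 36),
    ("Consider n-ary relationships", 27),
    ("Rule out corner cases (includes submodel)", 23),
    ("Keep models compact", 19),
]

initial = sizes[0][1]
final = sizes[-1][1]

# Reduction relative to the initial model after each step.
for step, size in sizes:
    print(f"{size:3d}  {step:43s} {1 - size / initial:6.1%}")

print(f"overall reduction: {1 - final / initial:.1%}")  # 53.7%
print(f"size ratio: {initial / final:.2f}")             # 2.16
```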