Q&A for DSM Webinar, 17 November 2009

Thank you for joining the webinar. We had a very active audience and could not answer all the questions during the 20-minute Q&A session. We hope many of the main issues were covered during the session; below you will find replies to your specific questions in a bit more detail.

Q: What are the knowledge curve challenges for a development team that changes from traditional development practices to MDD practices?
A: The biggest challenge is often organizational change, as there will be two roles: creating the language and generators, and using the language. The "knowledge curve challenge" question can therefore be divided into two parts too.

1) Those who define the language and generators for a particular family of systems need to have experience of building systems similar to those in the family - so they know what kind of code needs to be generated, how that code integrates with libraries, and can follow an appropriate architecture, programming model etc. The good news is that these people are already experienced, so if you use good tools to define the language and generators, only a small team is needed. In our experience, the size of the language and generator development team usually ranges from 1 to 3 people depending on the domain.

2) For those who use the language to solve customer problems there should be very little new to learn. This is simply because the modeling language directly uses the familiar concepts of the problem domain, so there is no need to learn new terms or semantics. Using code generators also means that a large portion, if not all, of the implementation details can be hidden from the modelers. For this reason some of the modeling - and hence development - can be done directly by domain experts rather than developers. The insurance case demonstrated in the talk illustrated that nicely: the insurance specialists could make models in the new language, and the corresponding code was automatically generated to run in the J2EE portal.

Q: How can we avoid the "Too Specific Generator" problem? Is there any suggestion/tip to check in the definition of the generator that can help to define the right level of detail? If the answer is yes, can you give us an example?
A: We are not absolutely sure what this question means, but generators should generally be built like any other program you care about: modularize them, refactor them, and make them easy to maintain. If the question concerns the generated output, then it is a good principle - at least in the beginning - to generate code that closely mirrors what you write manually today. This makes it easier to introduce generators and shows that you still have the same code as before. It also reduces risk: even if you decide not to carry on with DSM, you lose nothing. You have a lot of nice, consistent generated code that fits right in alongside your existing code and is perfectly documented in easily understandable models.
Q: How to add .Net code generators to MetaEdit+? 
A: Basically all generators, regardless of generator tool, output strings - streams of text - so there is nothing special as such about generating .Net. To put it simply, you just write the generator to output the strings from the model interspersed with your .Net language's keywords and syntax.
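
As a rough illustration of "strings from the model interspersed with keywords and syntax", here is a minimal sketch in Python rather than in MetaEdit+'s own generator language. The dictionary layout of the model, the entity name and the function name are assumptions made up for this example; a real generator would navigate the models stored in the tool.

```python
# Hypothetical model: a plain dict standing in for what a modeling tool stores.
entity_model = {
    "name": "Customer",
    "properties": [
        {"name": "Name", "type": "string"},
        {"name": "Age", "type": "int"},
    ],
}

def generate_csharp_class(entity):
    """Return C# source text for one model entity (illustrative only)."""
    # Fixed C# keywords and punctuation are interleaved with names from the model.
    lines = [f"public class {entity['name']}", "{"]
    for prop in entity["properties"]:
        lines.append(f"    public {prop['type']} {prop['name']} {{ get; set; }}")
    lines.append("}")
    return "\n".join(lines)

csharp_source = generate_csharp_class(entity_model)
print(csharp_source)
```

Targeting another .Net language in this sketch would only mean changing the fixed text in the templates; the model stays the same.
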
Q: Does your workbench work well with taking an existing language and adding higher level constructs to that language?
A: Sure. If you have an existing graphical language, you can keep it and add some new higher level constructs, either by editing the existing language or subtyping it. You can also ask yourself whether you want the whole of the lower level existing language to be visible still: an analogy with programming languages would be C++, which added OO constructs to C. Sometimes it might be better to keep things pure and make a new language with the higher level concepts, and generate directly from there. You can also generate the system in the lower level language, which would probably be the best idea if that language were a textual DSL.
Q: My understanding is that domain specific modeling or domain specific language is NOT a language in the sense of a programming language but it is a framework with high level objects and methods, etc. specific to a particular domain such as healthcare. Isn't it? Then using that framework you can depict your own healthcare application in a UML model as well, isn't it?
A: A domain-specific (modeling) language really is a language; that's what sets it apart from a library or a framework. UML and generic programming languages can of course be used to build the same applications, and frameworks can be used or not in all cases. UML and generic languages do not include the rules of the domain, and so cannot ensure that the application is modeled correctly. They don't raise the level of abstraction, and don't make modeling easier by using concepts familiar from the problem domain: for example, UML has no concept that is specific to healthcare applications. A DSM language for healthcare might have a concept "Diagnosis", interlinked with concepts "Fact" and "Test". In a DSM model you could then have a diagram with a couple of Facts, e.g. "Overweight" and "Age>60", a Test "Blood sugar high", and all three pointing via a relationship to a possible Diagnosis "Diabetes".
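
To make the difference between a language and a framework a little more concrete, the healthcare example can be sketched in Python. This is only an illustration under assumptions: the relationship name "indicates", the `ALLOWED_INDICATES` table and the classes are invented here, and a metamodeling tool would define these in the metamodel rather than in code.

```python
# Hypothetical metamodel rule: which concept may point to which via "indicates".
ALLOWED_INDICATES = {("Fact", "Diagnosis"), ("Test", "Diagnosis")}

class Element:
    """A model element: an instance of one concept of the language."""
    def __init__(self, concept, name):
        self.concept = concept
        self.name = name

class Diagram:
    """A model that only accepts relationships the language allows."""
    def __init__(self):
        self.links = []

    def indicate(self, source, target):
        # The domain rule is checked by the language, not remembered by the modeler.
        if (source.concept, target.concept) not in ALLOWED_INDICATES:
            raise ValueError(f"{source.concept} cannot indicate {target.concept}")
        self.links.append((source, target))

diagram = Diagram()
diabetes = Element("Diagnosis", "Diabetes")
for source in (Element("Fact", "Overweight"),
               Element("Fact", "Age>60"),
               Element("Test", "Blood sugar high")):
    diagram.indicate(source, diabetes)

# A relationship the language does not define is rejected.
try:
    diagram.indicate(diabetes, Element("Fact", "Overweight"))
    rejected = False
except ValueError:
    rejected = True
```

The three legal relationships are accepted and the illegal one is refused, which is the kind of correctness check a generic language does not give for free.
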
Q: Do you consider a tool like MetaEdit+ a DSL for building DSLs? 
A: Yes, that is a good way to look at this. Certainly we have in MetaEdit+ a graphical language to define modeling languages, but also built-in form-based tools to specify constraints, notational symbols and generators along with the basic language structures specified in the metamodel.
Q: What are the challenges for debugging, model level or code level? 
A: A sample of model level debugging was demonstrated during the webinar, and you can see a 3 min demo at: http://www.metacase.com/webcasts/DSM_AnimationAPI.html. The idea is to have trace links in the generated code, which are used to animate the models while running the code in a simulator or debugging it in an IDE.
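
The trace link idea can be sketched as follows. This is a simplified illustration, not the MetaEdit+ API: the state list, the element ids and the `trace` callback are assumptions invented for the example. The generator writes a trace call next to each generated statement, and at run time the callback reports which model element is currently executing, so a tool could highlight it.

```python
# Hypothetical model: (element id, label) pairs for three steps of an application.
states = [("s1", "Start"), ("s2", "SendSMS"), ("s3", "Stop")]

def generate(model_states):
    """Emit Python source in which every statement carries a trace link."""
    lines = ["def run(trace):"]
    for element_id, label in model_states:
        # The trace call ties the generated statement back to its model element.
        lines.append(f"    trace({element_id!r})  # trace link to model element")
        lines.append(f"    print({('executing ' + label)!r})")
    return "\n".join(lines)

source = generate(states)

# Run the generated code; the callback records which element to highlight.
namespace = {}
exec(source, namespace)
highlighted = []
namespace["run"](highlighted.append)
```

After the run, `highlighted` holds the model element ids in execution order, which is the information an animation of the model would need.
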
Q: Hi Steven, you mentioned MDA as an approach to build models (CIM, PIM and PSM) in the UML language. Your conclusion was that it doesn't provide much advantage since the UML is a generic modeling language. Obviously you can build an MDA facility using models described in Domain Specific Languages. What is your opinion about this?
A: It is true that some MDA proponents envisage higher forms of MDA incorporating elements of DSM. In these, the base UML can be extended with domain-specific enhancements, or even replaced with new MOF-based metamodels. However, experiences with the former have found current tools lacking the necessary extensibility, and no tools support the latter, largely because of fundamental problems and omissions in MOF itself.

MOF describes concepts of the language and how models of those concepts are to be stored and interchanged. A MOF description of a language describes little about those aspects that are of direct interest to its user: what models in the language actually look like, or how the user interacts with them.

The MDA approach also strongly emphasizes model-to-model transformations: from CIM to PIM, to another PIM or to PSM... until finally to some code. We have not really seen cases of this working, as such model-to-model transformations do not work well in practice: MDA suggests that you edit the output of generators, whether it is models or text, before moving to the next stage, and that gives rise to all the familiar problems of "round-trip" engineering. On the other hand, if we leave out the model-to-model part of MDA and generate directly from PIM to code, then the idea is basically the same as in DSM: a single model-to-text transformation. However, with DSM you do not need to limit the languages to UML-like ones.

Q: Do people often end up building separate languages and then later trying to integrate them? Are there any best/worst practices you've seen when merging languages?
A: There was not enough time in the webinar to address language integration issues, but it is obviously an important topic, as we usually need multiple languages. The answer depends a lot on the type of tools you are using. Some tools, for example, can't support several languages or reuse model elements between languages. In a tool we developed before MetaEdit+, we could open only one file containing just one language and its models. In the real world, however, we need to integrate the models of various languages, reuse elements between models etc. So, when we started to develop MetaEdit+, a core idea was to support integration and reuse on all levels. Language engineers, while making metamodels, could reuse language concepts among the languages as well as define explicit ways of integrating them. At the modeling level we also thought it should be natural to allow modelers to reuse data among models based on different languages, as well as to integrate models. By directly reusing or referencing existing model elements, we save modelers from having to do a manual 'find and replace' if the name of an element in one model is changed. In other tools, basically in all the other workbenches, there is no such functionality available at the moment. These tools keep the models separate, storing them individually in files in XML or similar, and then adding references among the files and their elements. This is something of a pain when modeling one system on your own, but rapidly becomes a nightmare when you start to version those models or work with multiple developers.
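
The difference between referencing a shared element and linking by name can be sketched in a few lines of Python. The class, the two model dictionaries and the language names below are hypothetical and only illustrate the principle; they say nothing about how any particular tool stores its models.

```python
class Element:
    """A model element that can be shared between models."""
    def __init__(self, name):
        self.name = name

# One shared element, referenced from two models in different languages.
customer = Element("Customer")
ui_model = {"language": "UI", "elements": [customer]}
behavior_model = {"language": "Behavior", "elements": [customer]}

# A name-based link, as in file-per-model storage, only keeps the text.
name_based_link = customer.name

# Rename the element once, in one place.
customer.name = "Client"

# Both models see the new name because they reference the same element.
names_seen = [m["elements"][0].name for m in (ui_model, behavior_model)]
# The name-based link still holds the old text and would need 'find and replace'.
stale = name_based_link != customer.name
```
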

What we have seen most often is starting with one language, perhaps for one subdomain only, and then extending that language or defining a new language to handle a larger part of the domain. If you create a new language, you can define in the integrated metamodel how the two languages are integrated.

Q: What is the most time consuming part in the DSL approach (modeling , code generation)? 
A: When looking at the whole software development project it is obviously the modeling. The creation of modeling languages and generators only takes a short time and requires only a small team, and then those languages and generators can be used by multiple developers for multiple applications and versions.

If we look at just the language and generator development phase, then as we saw in the slides in the webinar, making the code generator usually takes more time than making the language. That depends of course on the domain: if the target language to be generated is simple, let's say XML, then making the generator can be faster.

Q: Is MetaEdit+ a tool that enables one to define a DSM Language, a Code Generator and Framework Code to build apps for a given domain?
A: MetaEdit+ is a tool for defining languages and code generators that work with your framework, library etc.
Q: I don't think that it's difficult to build applications in a DSL-like way if you stay at the right abstraction levels. Just keep the top abstraction level closer to the domain and it'll be a cinch. Isn't it?
A: Exactly!
Q: Can your workbench aid in parsing a larger textual programming notation? 
A: Yes, you may parse text with the generator system (MERL language).
Q: How do you evaluate the DSLs? Did you use some kind of metrics, besides productivity (right level of abstraction,...), to determine the success of your 76 cases?
A: The evaluation of the 76 cases is described in detail in our article published in IEEE Software this summer. The article is available at: http://www.metacase.com/papers/WorstPracticesForDomain-SpecificModeling.html
Q: Why can't we use simple UML to create separate abstraction levels and high level domain objects and use the same to implement our application? Why a separate modeling language?
A: The original idea of the MDA approach is to use UML at the CIM level, the PIM level, and the PSM level. However, what UML describes best, in terms of well-defined semantics and suitability for code generation, is actually the code level classes that form the lowest level of abstraction in the MDA stack of models. Using the mobile phone example demonstrated at the beginning, the ideal high level concepts would not be any of the UML concepts (or coding concepts in general) but concepts specific, in this case, to mobile phone applications. These include concepts like Query, sending an SMS, making a call etc. These are the kind of high-level concepts that are applied in a domain-specific modeling language. Another way to answer this question would be to try to define a simple UML that delivers the same kind of modeling support as the demonstrated DSM language provided. While that could be done in theory, all the domain rules that would need to be added to UML (as naming conventions or profiles) would simply be too time consuming to define and, most of all, too difficult for modelers to learn. There would be hundreds of rules that the modeler would have to remember when using the enhanced UML. The rules would be like: 'use the <<field>> stereotype for the attribute definition when specifying a form field. The <<field>> stereotype is applicable only if its class has the <<form>> stereotype. Attributes with the <<field>> stereotype can only have data types of Boolean and String, except if the model also includes a component called Map, in which case they can also be of data type GPSLocation', and so on.
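
To show what just the two example rules above amount to, here is a sketch of them written as an explicit check in Python. The dictionary layout of the UML model and the class and attribute names are assumptions made up for the illustration. In a DSM language, rules like these are built into the language definition, so the modeler never has to remember or run them.

```python
def check_field_rules(model):
    """Return violations of the two example <<field>> rules quoted above."""
    allowed_types = {"Boolean", "String"}
    # Exception from the rules: a Map component also allows GPSLocation.
    if "Map" in model["components"]:
        allowed_types.add("GPSLocation")

    violations = []
    for cls in model["classes"]:
        for attr in cls["attributes"]:
            if "field" not in attr["stereotypes"]:
                continue
            where = f"{cls['name']}.{attr['name']}"
            if "form" not in cls["stereotypes"]:
                violations.append(f"{where}: <<field>> requires a <<form>> class")
            if attr["type"] not in allowed_types:
                violations.append(f"{where}: type {attr['type']} not allowed")
    return violations

# Hypothetical UML model expressed as plain data.
uml_model = {
    "components": [],
    "classes": [
        {"name": "Login", "stereotypes": ["form"], "attributes": [
            {"name": "user", "type": "String", "stereotypes": ["field"]},
            {"name": "position", "type": "GPSLocation", "stereotypes": ["field"]},
        ]},
        {"name": "Account", "stereotypes": [], "attributes": [
            {"name": "active", "type": "Boolean", "stereotypes": ["field"]},
        ]},
    ],
}

violations = check_field_rules(uml_model)
uml_model["components"].append("Map")  # adding Map makes GPSLocation legal
violations_with_map = check_field_rules(uml_model)
```

Even this small checker covers only two of the hundreds of rules mentioned, which gives a feel for the effort of adding domain rules on top of UML.
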
Q: Do you see many cases of breaking the problem down into multiple DSMs to reduce complexity? 
A: Yes, and most often companies use several languages that are integrated. How the whole problem domain is divided between the different languages depends on the case. One way to look at this is through the existing set of roles and tasks in the development process: one group specifies the UI, and a second group defines the behavior; you might therefore end up with one language for each group, or have the UI group make a basic model and have the behavior group fill out its details. Another way would be to have different languages for different levels of detail, such as one language to specify the hardware architecture, another the software architecture, and a third to specify individual features in the system.
Q: I think sticking to the right abstraction level is the real issue, isn't it?
A: Yes.