DSM

Greenfield: base your DSM language on your code

October 12, 2007 01:22:12 +0300 (EEST)

Jack Greenfield has a new article on Software Factories up at Methods and Tools. Maybe I'm tired, but I found it kind of heavy going. The most interesting thing was related to our recent discussion, Which is primary, problem or solution: whether to base modeling languages on the problem domain or solution domain. Jezz Santos had said to use the solution domain, which was a surprise to me -- and to other people like Peter Bell, Antoine Savelkoul, and Juha-Pekka.

The good news for Jezz is that he's not alone: even at the very top of the Microsoft Software Factories hierarchy, Jack Greenfield says to base things on the code:

In order to support the definition and differentiation of individual factories, the methodology starts by identifying and classifying architectural styles for frequently encountered families of solutions, such as web portals, smart clients and connected systems. It then captures information about each solution family, such as the artifacts needed to build its members, the technologies used and their placement within the architecture, required technology configurations, and best practices across the life cycle, including key decisions points and trade-offs.

This isn't just a matter of different ways to skin the same cat, with each person preferring the way they're familiar with. Antoine and Juha-Pekka even have articles to prove that they've tried doing it both ways: Building a DSL for an existing Framework and Creating a Domain-Specific Modeling Language for an Existing Framework (no prizes for originality of titles!). Yet both came down firmly on the side of starting from the problem domain, as seen in Antoine's blog and Juha-Pekka's article from 2004. That article examined 23 real-world applications of DSM to see how the modeling language concepts had been identified. It found four sources or approaches:

  1. Domain expert's concepts
  2. Generation output (i.e. the solution domain)
  3. Look and feel of the resulting system
  4. Variability analysis

More importantly, it also analysed where each approach had been effective -- or not. For approach 2, basing the DSM language on the solution artifacts and constructs, it found three situations:

In approach 2, generation output, there were significant differences between those cases whose generation output was itself an established domain-specific language, and those where the output was a generic language or an ad hoc [language] or format such as a configuration file.
  1. Those cases worked best where the output was an established domain-specific language, because the domain was more mature and the company in question already had a mature implementation framework, either their own or from a third party. In both [such] cases, the companies wanted their own additions to the languages, further improving the domain specificity.
  2. When the output is in a generic programming language, it would often be better to apply an approach other than generation output, to truly raise the level of abstraction.
  3. When the output is to an immature format, it would often be better to analyze the domain further to improve its understanding and the output format, rather than build a direct mapping to the existing shaky foundation.

Jack does hint at more specialized factories than the earlier very generic "horizontal" factories he has tended to talk about. But the level of abstraction is not being raised much above how people currently code:

Known good patterns and practices in the target domain are harvested, refined and encoded into domain specific tools and runtimes. The tools are then used to rapidly select, adapt, configure, complete, assemble and generate solutions that use the runtimes.

Let's do a thought experiment. Imagine yourself back in the days when applications were built in assembly language. Which of the verbs in the second sentence of that quote (select, adapt, configure, complete, assemble, generate) would indicate a radical shift upwards in the level of abstraction? For example, if you can select among some existing chunks of assembly language, that may be nice, but you're still working on the same level: you've not moved to third generation languages yet. Only the final verb, generate, accomplishes that change.
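
To make the thought experiment a little more concrete, here is a minimal Python sketch of the difference between selecting and generating. Everything in it is made up for illustration: the pseudo-assembly mnemonics, the SNIPPETS library, the stopwatch model and the function names are assumptions, not anything taken from Jack's article or from Software Factories.

```python
# Hypothetical illustration only: the pseudo-assembly and all names are invented.

# "Select": pick a ready-made chunk of solution-domain code. The developer
# still reads, adapts and assembles code at the same level of abstraction.
SNIPPETS = {
    "increment": "LOAD R1, counter\nADD R1, 1\nSTORE R1, counter",
    "reset": "LOAD R1, 0\nSTORE R1, counter",
}


def select(name: str) -> str:
    """Return an existing chunk of low-level code, unchanged."""
    return SNIPPETS[name]


# "Generate": describe the problem in domain terms (states and events)
# and let a generator produce the low-level code from that description.
def generate(model: dict) -> str:
    """Turn a state -> {event: next_state} model into pseudo-assembly."""
    lines = []
    for state, transitions in model.items():
        lines.append(f"{state}:")
        for event, target in transitions.items():
            lines.append(f"    CMP event, {event}")
            lines.append(f"    JEQ {target}")
        # No matching event: stay in the current state.
        lines.append(f"    JMP {state}")
    return "\n".join(lines)


# The model mentions only problem-domain concepts, no registers or jumps.
stopwatch = {
    "Stopped": {"START": "Running"},
    "Running": {"STOP": "Stopped"},
}

if __name__ == "__main__":
    print(select("reset"))
    print(generate(stopwatch))
```

With select, what you work with is still the assembly text itself. With generate, what you work with is the stopwatch model, and the assembly becomes an output you rarely need to read.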

In the same way, insofar as Jack's article is a good description of Software Factories, it looks like their emphasis is more on small percentage improvements to existing ways of building software. That's a shame, especially given that earlier they seemed more focused on the DSL and generation elements that they share with Domain-Specific Modeling. The $64,000 question is: why this change of emphasis? Jack, Jezz, Prashant Sridharan and other MS people have all made comments to the effect that doing real problem-domain-based DSM has proven too hard for them. Why are they failing, when so many others are succeeding? For examples of success, just take a look at the articles from the upcoming 7th OOPSLA workshop on Domain-Specific Modeling (e.g. 24, 14 and 10 are all graphical DSL examples).