MetaCase Forum > MetaEdit+

Thousands of Objects in a Graph

sap630 (Contributor, joined 22.Sep.2009)
Posted: 22.Sep.2009 at 08:39
Suppose we have 5000 objects of type "object_type_1" linked to 10000 objects of type "object_type_2" in one graph (picture one very large ERD where the object_type_1 objects are entities and the object_type_2 objects are attributes). What options are available to view all this information section by section (i.e. in small subsets)?

Is it possible to link relationships across separate graph instances?

If I now create a new object_type_1 object in the graph (5001 object_type_1 objects now exist) and I need to connect it to specific object_type_2 objects that already exist, is there a way to filter the 10000 objects based on their name property?
stevek (MetaCase, joined 11.Mar.2008)
Posted: 22.Sep.2009 at 11:50
Technically, you could have one graph (conceptual information, abstract syntax) with several diagrams (representational information, concrete syntax). The conceptual graph has 15000 objects, but each diagram only shows say 50 of them. In each diagram you'd have visible the objects you want in that diagram, plus any others from other diagrams that you want to see relationships to. You can use the right-hand column of the Graph Browser in the main MetaEdit+ window to filter the objects based on their name property and type, and copy and paste the desired object_type_2 from there.
 
A better approach would be to decompose your graph of 15000 objects into sensible units. Studies of human cognition show that we simply can't work well with such a large number all in one graph, even if we filter or use views. But if we break it down into subgraphs/modules, each of which can be considered at a higher level as its own unit, we can cope fine. You'll probably end up with 3 levels of graphs: 1 top-level graph with 20 "module" objects, each of which decomposes to its own graph with 20 "module" objects, each of which decomposes to a normal graph with 10-15 object_type_1 and 20-30 object_type_2 instances. An alternative would be 4 levels, with 7-8 module objects rather than 20 - or then a combination, 3 levels deep in some places and 4 in others.
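As a back-of-the-envelope check of the decomposition sizes Steve suggests, the module counts and levels determine how many leaf graphs there are and how many objects each must hold (the numbers below simply restate the arithmetic from the paragraph above):

```python
# With B modules per module level and L module levels, the leaf graphs
# below them must together hold all 15000 objects.

def leaf_graphs(branching, module_levels):
    """Number of leaf graphs below `module_levels` levels of modules."""
    return branching ** module_levels

# 3 levels total = 2 module levels of ~20 modules each -> 400 leaf graphs
leaves = leaf_graphs(20, 2)
per_leaf = 15000 / leaves
print(leaves, per_leaf)   # 400 37.5 -- matches 10-15 + 20-30 instances per graph

# 4 levels with 7-8 modules per level gives a similar leaf-graph size
print(leaf_graphs(7, 3))  # 343
```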
 
Aim for modules with high cohesion (the objects in there make sense together and interact with each other) and low coupling (minimum number of relationships across subgraph boundaries). To make a relationship from an object A in one subgraph to an object B in a different subgraph, you can simply reuse object B in the first subgraph. Alternatively, you could have a new object type object_type_2_ref, with a single property that points to an object_type_2. In either case, you can open the Info... dialog of the object_type_2, to find out in which graph it is defined.
 
With 15000 objects you probably also need to think about integrating the work of multiple users. The multi-user repository of MetaEdit+ makes this easy and transparent, so all your users can work together - much simpler than trying to merge and reconcile multiple independent edits with an old-fashioned textual version control system.
sap630 (Contributor, joined 22.Sep.2009)
Posted: 23.Sep.2009 at 11:56
Thanks Steve.

Any chance you have a MXT file describing the CWM (Common Warehouse Metamodel) specification?

Also, any XSLT to convert CWM models into MXM files?
stevek (MetaCase, joined 11.Mar.2008)
Posted: 23.Sep.2009 at 13:18
Sorry, we don't have an MXT file for CWM. The CWM XSchema is over 1MB, 74 packages, 470 classes. As with all XSchemas, it's massively underspecified for use as a metamodel, so there's no way to automatically make a good MetaEdit+ metamodel for it. Instead, you need to understand what they intended, and make your own decisions about what makes a good modeling language for human use, as opposed to just being able to store the data.
 
Experience shows that a MetaEdit+ metamodel to contain the same information as an OMG XSchema is much smaller and easier to understand - much of the bloating of OMG schemas is due to the unsuitability of MOF to describe metamodels and of XMI to store them.
 
If you don't need full CWM compatibility, but just to import an existing data set, I'd suggest making your own metamodel based on the needs of your domain. You can then build a naive text-to-model transformation that is able to read just what is in your existing data set, and build the MXM file you want. That's a couple of orders of magnitude faster than trying to make a full, bulletproof XSLT and MXT for CWM. And remember that even if you had the full versions, the chances of being able to import correctly from all other tools that claim CWM support are slim indeed (cf. XMI for UML).
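A naive text-to-model transformation of the kind described above can be very small: read only the fields your existing data set actually uses, and emit model XML. The element names in this sketch are purely illustrative; the real MXM format is defined by MetaEdit+, not by this example.

```python
# Toy text-to-model transformation: CSV rows -> illustrative model XML.
# The <models>/<object>/<property> element names are invented for this
# sketch and are NOT the actual MXM schema.
import csv
import io
from xml.sax.saxutils import escape

def rows_to_model_xml(csv_text, object_type):
    out = ["<models>"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        out.append(f'  <object type="{escape(object_type)}">')
        for prop, value in row.items():
            out.append(f'    <property name="{escape(prop)}">{escape(value)}</property>')
        out.append("  </object>")
    out.append("</models>")
    return "\n".join(out)

data = "name,datatype\nCustomerID,int\nName,varchar(80)\n"
print(rows_to_model_xml(data, "Column"))
```

The point is that a reader for *your* data set only needs to handle the columns that actually occur in it, which is why it is so much cheaper than a bulletproof CWM importer.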
sap630 (Contributor, joined 22.Sep.2009)
Posted: 15.Oct.2009 at 23:55
Is it possible for one meta-metamodel to enforce the design of a metamodel, which in turn enforces the design of a model?

e.g. The ERD in Examples.
stevek (MetaCase, joined 11.Mar.2008)
Posted: 16.Oct.2009 at 00:13
I'm not sure I understand your question, but I'll try and answer.
 
The meta-metamodel in MetaEdit+ is GOPPRR, which is fixed. You can use the metamodeling tools to define your own metamodel, as we did when building the ER metamodel. Having done that, you can use the modeling tools to build your own models, as we did when building the example ER diagram "Orders and Products". The model conforms to the metamodel, and the metamodel conforms to the meta-metamodel.
 
If you are envisaging multiple layers of people who can "enforce the design" of the next level down, that can work well too. You don't need extra meta-levels, though. For instance, you can make a base metamodel and give that to a few other metamodelers, each of whom can make extensions to it (in accordance with your instructions of what they are allowed to add, change, subtype etc.). Each extended metamodel can be given to a group of modelers, who can make models that will conform to that extended metamodel - and also to your base metamodel (insofar as your instructions require).
 
We also have customers who partly automate the process of extending a metamodel, to make sure that the middle level of metamodelers only follow the top level's instructions, or simply to make it easier for the middle level.
 
As we did in the graphical GOPRR modeling language, you can also build a modeling language whose domain is "modeling languages", and which generates the MetaEdit+ metamodel XML import format, MXT files. By drawing a model and pressing the Generate button, you can thus create a metamodel. Your metamodeling language could be similar to GOPRR or completely different: the only requirement is that it generates valid MXT files.
sap630 (Contributor, joined 22.Sep.2009)
Posted: 31.Oct.2009 at 15:02
Originally posted by stevek:

As we did in the graphical GOPRR modeling language, you can also build a modeling language whose domain is "modeling languages", and which generates the MetaEdit+ metamodel XML import format, MXT files. By drawing a model and pressing the Generate button, you can thus create a metamodel. Your metamodeling language could be similar to GOPRR or completely different: the only requirement is that it generates valid MXT files.


Fascinating; I don't know how I missed that manual. The Family Tree example is all about metamodeling the concept of a family tree, but using the individual tools (Graph Tool, Object Tool, etc.). I had no idea that the GOPRR project, along with the link you gave me, would make meta-metamodelling easier!

It didn't click that Figure 1-3 from the Evaluation tutorial can actually be used in the GOPRR project, which you can then Export and Build.

Here is a broad description of our current process in a typical data warehouse environment:
  1. Collect Raw Data Into Transactional Database (OLTP) flat files
  2. Extract-Transform-Load to 3NF (third normal form) Into First RDBMS
  3. Extract-Transform-Load to a Star Schema design Into Second RDBMS


Database design in steps 2 and 3 is done visually (MDA development) with automatic code generation. The Extract-Transform-Load (ETL) steps are also done visually, with automatic code generation, using another tool.

The current tools are powerful in what they do best. Our biggest problem, however, is that all our metadata is scattered everywhere, so we are investigating the use of a central repository such as Apache Jackrabbit (or the commercial version called CRX).

In addition to having a central metadata repository, we would need:
  1. To describe the structure of this repository, perhaps in MetaEdit+; however, our OLTP metadata alone consists of thousands of M1 objects in a MOF-like layering scheme.
  2. To include the ETL and RDBMS metadata in the repository in such a way that clear lineage is maintained from OLTP->RDBMS via the ETL processes (e.g. if you modify an attribute of an OLTP object property, regenerating the two RDBMS tables will be fine unless some major structural changes are done, similar to adding the concepts of Male/Female in the Family Tree example).
  3. To extend the metamodel (M2 layer) of the central repository, perhaps in MetaEdit+, so that we can do ad-hoc visualization and code generation of objects and structures that no other tool describes (for example, designing star schemas may be automated later on if we can extend the database tool metadata to include the concept of attributes of attributes).
  4. Versioning and access control on an object-by-object basis (i.e. certain M1/M2/M3 objects are given individual access rights per user). Jackrabbit/CRX is close, but still under investigation.


As such, I am trying to determine if MetaEdit+ would be an ideal tool for:
  • Metamodeling (M2 layer) and extension in ad-hoc ways

  • Modeling (M1 layer) of thousands of objects

  • Our Central Metadata Repository Itself


Also, I'm wondering whether there are any plans to open up the MetaEdit+ repository into a more JCR-like (Jackrabbit/CRX) repository?
stevek (MetaCase, joined 11.Mar.2008)
Posted: 02.Nov.2009 at 13:38
So what you need is a modeling language for describing database schema and ETL transformations. It will have concepts like Table, Column, and various ETL operations, e.g. Split to split a string value based on a separator character.
 
You can then build models of your databases and transformations, e.g. in the first RDBMS is a Table "Employee" that has a Column "Name"; in the second RDBMS is a Table "Personnel" that has Columns "First Name" and "Last Name"; and between them is an ETL transformation that uses a Split from "Name", with the first part going to "First Name" and the second to "Last Name":
 
                              /----first---> "First Name"
"Name" ----- "Split on: space"
                              \---second---> "Last Name"
 
Hopefully it is obvious this is a simplification! The main thing I wanted to make clear is how the various meta-levels would work. That will stay the same whether you have one column or 50 000, and whether you have a simple modeling language or a complex one. You can extend the modeling language as you go, as you mention in point 3 above.
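The Split operation sketched above can be pictured as a plain function; this is a toy illustration of the ETL semantics, not MetaEdit+ functionality:

```python
# A toy stand-in for the Split ETL operation described in the example:
# split a source value on a separator, feeding two target columns.

def split_on(value, separator=" "):
    """Split a string value into (first, second) on the first separator."""
    first, _, second = value.partition(separator)
    return first, second

# "Name" in the first RDBMS feeds "First Name" and "Last Name" in the second
first_name, last_name = split_on("Ada Lovelace")
print(first_name, last_name)   # Ada Lovelace
```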
 
By modeling all this in MetaEdit+, you can solve the problems you currently face by having several tools. If you change "Last Name" to "Surname", you don't need to find the places in both your ER tool and your ETL tool where that is referenced: you just change it once in the model in MetaEdit+, and that change is visible in both the schema models and the ETL models. I'd probably have separate Graph types for schema modeling and transformation modeling, with the Column type used by both. The schemas define the Column objects; the transformations use them. You can have multiple users working in the same MetaEdit+ repository - some building schemas, some building transformations, maybe someone extending the modeling language. Versioning and locking happen at the level of objects, so you can work together without the tool getting in the way.
 
You can write generators to check the things you need to ensure, e.g. that every column in the target database is mentioned on the RHS of some ETL rule, and that the ETL rules only reference columns that are actually in the respective tables. You can make the warnings from those checks show up when you want, e.g. only when doing a build, or instantly in the diagram if a modeler tries to connect an illegal column (e.g. makes the mapping backwards).
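The kind of check such a generator would run can be sketched in a few lines; plain Python dicts and tuples stand in for the model here, since the actual generator would walk MetaEdit+ model data instead:

```python
# Sketch of model checks: every target column must appear on the RHS of
# some ETL rule, and rules may only mention columns that exist.
# Rules are (source_column, target_column) pairs in this toy model.

def check_etl(rules, source_cols, target_cols):
    warnings = []
    covered = {rhs for _, rhs in rules}
    for col in sorted(target_cols - covered):
        warnings.append(f"target column {col!r} is never filled by any rule")
    for lhs, rhs in rules:
        if lhs not in source_cols:
            warnings.append(f"rule reads unknown source column {lhs!r}")
        if rhs not in target_cols:
            warnings.append(f"rule writes unknown target column {rhs!r}")
    return warnings

rules = [("Name", "First Name"), ("Name", "Last Name")]
print(check_etl(rules, {"Name"}, {"First Name", "Last Name", "Email"}))
# -> one warning: 'Email' is never filled by any rule
```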
 
You can also write generators to produce the SQL that creates the schemas in your first and second databases, and the ETL script (whatever format your ETL engine needs). If you prefer, you can export models so your existing ER and ETL tools can open them (exporting to XML and transforming with XSLT, or writing a generator to create a text format readable by the tools, or using the MetaEdit+ API to access the model data directly).
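A schema generator of this sort is essentially a walk over the model emitting text; the model shape below is invented for illustration and is not MetaEdit+'s generator language or API:

```python
# Minimal DDL generator sketch: walk a toy model of tables and columns
# and emit CREATE TABLE statements.

def generate_ddl(tables):
    """tables: {table_name: [(column_name, sql_type), ...]}"""
    stmts = []
    for table, columns in tables.items():
        cols = ",\n  ".join(f"{name} {sqltype}" for name, sqltype in columns)
        stmts.append(f"CREATE TABLE {table} (\n  {cols}\n);")
    return "\n\n".join(stmts)

model = {"Personnel": [("FirstName", "VARCHAR(40)"), ("LastName", "VARCHAR(40)")]}
print(generate_ddl(model))
```

Because the schema models and the ETL models share the same Column objects, the same model walk can feed both the DDL generator and the ETL-script generator.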
 
As for Jackrabbit: to be honest, I don't think just having all the models (i.e. your "metadata") in one content repository is enough of a solution. That just puts it in one place; you can do that by putting your current ER and ETL files on the same hard disk! The big question isn't where it is, but are there tools to access it and know what it means (i.e. that a Column is defined in the schema and used in the ETL). With MetaEdit+, you get the repository and the tooling.
Luc (Member, Australia, joined 05.Nov.2009)
Posted: 05.Nov.2009 at 03:40
Steve - our application has the equivalent of thousands of database tables, tens of thousands of columns, two thousand screens/web pages, tens of thousands of fields, etc. Starting from any one of these objects we generate source code for our application. For example, starting from a database table definition we generate DDL and database access methods. From a screen definition we generate validation rules for each updatable field on the screen.
 
These objects are linked to each other; thus a screen field is linked, via an intermediate abstract object we call "element", to database columns. (Maybe we could have modelled our metadata differently - however, that is what we have now.) The object type "element" contains the specification of each data item, whether a database column, screen field, or attribute in an XML message/document. These specifications contain the usual data type information, plus lists of domain values and their representation on different media - for example, for an element called "maritalStatus" a domain value would be "single", and its representation might be "SIN" for storage on the database, "Single" for display on a web page or screen, and its xmlName might be "mst".
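One way to picture such an "element" object is as a single definition that both the database column and the screen field point to; the dictionary shape and names below are invented for illustration, not taken from Luc's actual metadata:

```python
# Toy model of an "element": one definition shared by database columns,
# screen fields, and XML attributes, with per-medium representations.

marital_status = {
    "name": "maritalStatus",
    "datatype": "code",
    "domain": {
        "single": {"db": "SIN", "screen": "Single", "xmlName": "mst"},
    },
}

def representation(element, value, medium):
    """Look up how a domain value is rendered on a given medium."""
    return element["domain"][value][medium]

print(representation(marital_status, "single", "db"))       # SIN
print(representation(marital_status, "single", "screen"))   # Single
```

The shared-definition shape is exactly why, as Steve notes below, having the element appear in many graphs without duplication matters.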
 
Now, suppose we have a separate graph for each database table, and a separate one for each screen or web page. Since these graphs all connect via the element objects, how do we avoid replicating the information from the elements into the two types of graphs?
 
Further, in another view we might want to draw an ER diagram of the tables. It is true that it is not useful to draw an ER diagram containing thousands of tables, so it does make sense to group the tables somehow. However, this does not remove the need for a table in one group to be linked to another table in a different group. So, how would we deal with the situation where we need to draw a model of that particular link and the neighbouring tables?
 
Thus, we have hundreds of thousands of metadata objects interlinked in various ways. We were wondering if it would be possible for a user to select an object, say a particular database table, and for MetaEdit+ to draw a particular type of graph relating to the table, its columns and the elements? And, in another use case, for MetaEdit+ to draw a different graph, e.g. an ER diagram centred on that table but containing adjacent tables, perhaps one, two or three steps away?
 
Thanks   
stevek (MetaCase, joined 11.Mar.2008)
Posted: 05.Nov.2009 at 13:10
Luc - having the "maritalStatus" element in many graphs, even of different types, is no problem for MetaEdit+. In fact, it's what MetaEdit+ does best! The same object can be in many graphs: it's not a copy or duplicate, it's the same identical object. Graphs point to their objects, rather than strongly containing them: many graphs can point to the same object.
 
Similarly, objects can point to other objects. This allows you to create references, so you don't need to directly include a 'foreign' object in a graph, but can have a different type of object directly in the graph, and have that object point to the 'foreign' object. In your ER diagram example, that's one way of modularizing your database: try and keep most links between tables internal to the module containing those tables, but allow links to tables in other modules via these reference objects. Of course nothing stops you from directly including the elements from outside the group if you want; it's just often easier for modelers to understand if you make explicit which links are considered internal and which external.
 
As you say, the exact choice of how to model and link screen fields with database tables is an open question. At the small/new/simple end of the scale, people make the screen fields primary: they just want to model the UI and have the database automatically generated. At the large/legacy/complex end of the scale, people make the database primary: the schema exists and is largely fixed, and when you create a UI field you need to link it to some existing database column. Somewhere in that scale is the best solution for your needs, and we'd be happy to help you find that.
 
As to generating graphs on the fly, that's certainly possible. The MetaEdit+ generators can produce new graphs in MetaEdit+'s Model XML format, and import those for the modeler to see. Reading a new graph however is hard on the human brain, a bit like a map to an unfamiliar town, or even worse a familiar area where all the towns have been rearranged into different positions. As far as possible I'd thus aim to make the existing models naturally answer the questions the modelers are likely to want to ask - that's largely a question of creating the right metamodel, e.g. one with extra concepts to better cope with questions of large scale (compared to ER diagrams).
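For illustration, the "tables one, two or three steps away" selection Luc asked about is a breadth-first walk over table links; plain Python dicts stand in for model data here (none of this is MetaEdit+ API), and a generator producing Model XML could use the resulting set as the new graph's contents:

```python
# Collect all tables within `max_steps` relationship hops of a start
# table, using a toy adjacency map in place of real model data.
from collections import deque

def neighbourhood(links, start, max_steps):
    """Return the set of tables reachable from `start` in <= max_steps links."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        table, dist = frontier.popleft()
        if dist == max_steps:
            continue
        for other in links.get(table, ()):
            if other not in seen:
                seen.add(other)
                frontier.append((other, dist + 1))
    return seen

links = {"Order": ["Customer", "OrderLine"], "OrderLine": ["Product"]}
print(neighbourhood(links, "Order", 2))
# contains Order, Customer, OrderLine, Product
```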
 
Tooling helps here too: e.g. you can select any object in a graph and ask for its Info, which will show you all the other graphs where it is used and allow you to jump directly to that object in those graphs. Another nice feature of MetaEdit+ is the generation of reports that are linked to objects: e.g. you could create reports that would show the information that the modeler would want, and he can then double-click the text of the desired object in the report output and jump straight to that object in the model. These features are really useful when you want to explore a large model.
Forum Software by Web Wiz Forums® version 12.05
Copyright ©2001-2022 Web Wiz Ltd.