Trees, Forests, Vines.
My current clients are using Model Driven Development for a new product line. Obviously I'm not able to say what the product line is, but that doesn't matter for the work I have to do, which has to do with issues they are having maintaining model consistency between distributed teams on three sites. It's actually quite close in spirit to a project on hypertext repositories for multi-view engineering models which I started as a research associate at York, but couldn't get continued funding for.
They have a large UML/SysML model and Simulink models, between which they are trying to maintain traceability. The UML model is created in Enterprise Architect and shared between teams using XMI export and a ClearCase repository. They have modelled their requirements in EA, and are tracing these system requirements to requirements in each sub-system, and from these local requirements to the implementation. They aren't using separate domain and implementation models, as is practised in some UML styles.
EA has a rather awkward XMI import mechanism when it comes to links between elements in different packages: when a package is imported, all links whose client end is in the imported package are erased, then all links defined in the XMI are created. A link is only created if both its client and supplier ends exist in the model. This means that unless an inter-package link is recorded in the XMI at both ends, a link to a newly created element may not be imported, and so will disappear from the XMI the next time it is committed to ClearCase.
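A minimal sketch of the import behaviour as I understand it - the Model, Link and import_package names are mine for illustration, not EA's API:

    from dataclasses import dataclass, field

    @dataclass
    class Link:
        client_id: str       # dependent end of the link
        supplier_id: str     # depended-on end
        client_package: str  # package owning the client element

    @dataclass
    class Model:
        elements: set = field(default_factory=set)  # ids of elements present
        links: list = field(default_factory=list)

    def import_package(model: Model, package_id: str, xmi_links: list):
        # 1. Erase every link whose client end lives in the imported package.
        model.links = [l for l in model.links if l.client_package != package_id]
        # 2. Recreate links from the XMI, but only where both ends exist.
        for link in xmi_links:
            if link.client_id in model.elements and link.supplier_id in model.elements:
                model.links.append(link)
        # A link whose supplier hasn't been imported yet fails step 2, and
        # so is missing from the next XMI export committed to ClearCase.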
There are work-arounds for this, but the problem comes down to a mismatch between a hierarchical version control system, where users work on a package and its contents, and a hyperlinked model, where links can exist between different nodes in the hierarchy and really belong to neither end.
Once you also introduce baselines and viewpoints into such an environment, you get to a state where you have to both restrict the types of links - most UML links are designed so the client depends on the supplier, so the semantics are compatible with recording the link in the client's package only - and order the updates - the supplier must exist in the model before the client is loaded for the link to be created. This makes it harder to update models and for peer teams to push out baselines: you have to update the models for each subsystem team in an order consistent with any dependencies between the subsystems.
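In effect each integration is a topological sort of the subsystem imports; a sketch using Python's graphlib, with an invented dependency map:

    from graphlib import TopologicalSorter, CycleError

    # Hypothetical map: each subsystem lists the suppliers it depends on,
    # which must be imported before it.
    depends_on = {
        "avionics": {"sensors", "power"},
        "sensors": {"power"},
        "power": set(),
    }

    try:
        # Suppliers come out before clients, so every link's supplier end
        # exists by the time the client package is loaded.
        order = list(TopologicalSorter(depends_on).static_order())
        print(order)  # ['power', 'sensors', 'avionics']
    except CycleError:
        # Mutually dependent subsystems have no safe import order at all.
        print("cycle: no import order satisfies the dependencies")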
The difficulty of ordering baselines between subsystems is mitigated by designing to an interface, as that reduces the dependencies between subsystems, but it does not eliminate them. High integrity systems also have common effects which create dependencies between subsystems that cannot be designed around using interfaces (thermal characteristics, electromagnetic interference, power use, etc.), and a model without these has lower predictive value. One technique to get around these dependencies is to apportion a budget for them rather than calculating them, which pushes them up towards the root of the tree. The other is to create a dependent model which is traced to the design model but represents the EM view or the thermal view of the system. So in addition to having vines between the branches of the design tree, there can be a forest of trees which model different aspects of the system.
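Of the two, apportioned budgets at least lend themselves to a mechanical check; a toy sketch, with all names and numbers invented:

    # Apportion a system-level power budget down to subsystems instead of
    # calculating cross-subsystem effects; the margin is held at the root.
    system_budget_w = 100.0
    apportioned_w = {"power": 20.0, "sensors": 30.0, "avionics": 40.0}

    margin_w = system_budget_w - sum(apportioned_w.values())
    assert margin_w >= 0, "apportioned budgets exceed the system budget"
    print(f"unallocated margin held at the root: {margin_w} W")  # 10.0 W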
Having looked at the capabilities - and been a bit hopeful about 'model views' - EA doesn't seem to have anything to support multiple viewpoints on the model. You can't create a second tree based on a different viewpoint, and there is no navigation via backward «trace» dependencies (you can create a second tree of packages which trace to the original tree, but navigating to the original requires clicking through dialogs). EA also doesn't create synthetic nesting relationships between packages in the tree, or synthetic dependency relationships between packages whose elements depend on each other, which would be useful if you have more than one hierarchy in the system. Multiple hierarchies arise when you are dividing a system in different ways - for example, a structural view divided into zones, a functional view into sub-functions, and a sub-systems view into sub-systems.
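Deriving synthetic package-level dependencies from element-level ones is simple in principle; a sketch, with invented ownership data, of what I'd want the tool to do for me:

    # Lift element-level dependencies to package-level ones.
    owner = {"Altimeter": "sensors", "Bus": "power", "Display": "avionics"}
    element_deps = [("Altimeter", "Bus"), ("Display", "Altimeter")]

    package_deps = {(owner[client], owner[supplier])
                    for client, supplier in element_deps
                    if owner[client] != owner[supplier]}
    print(package_deps)  # e.g. {('sensors', 'power'), ('avionics', 'sensors')}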
I have a strong feeling that a distributed model, based on XMPP or Atom protocols between nodes for each team, would be good, but that doesn't remove the requirement to externalise the model for baselining or backup using industry-standard version control tools, or the issues with import into existing UML tools. There is also a distinct difference in view between the idea of 'the model' and 'a model'. Having moved away from CASE systems to distributed C4I systems for a couple of years, there are techniques for working with uncertain data and distributed databases which might be interesting to pursue, but which are not going to sell to most engineering organisations large enough to need them: if you don't control the situation, you work with distributed models based on the best information available at the time, correlating information from teams, rather than partitioning a system into a rigid hierarchy and trying to manage updates to that hierarchy. In such a distributed model, different viewpoints do not conflict - there is no one 'true' hierarchy of the system, only whatever hierarchy is useful for the team using that viewpoint - and assertions are made about an object which the owners of the object then choose to confirm or reject as authentic.
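The assertion side of that might have a data shape something like this - a sketch only, not any existing protocol:

    from dataclasses import dataclass

    @dataclass
    class Assertion:
        subject: str      # id of the object the claim is about
        claim: str        # e.g. "max_power_w = 12"
        asserted_by: str  # team making the claim
        status: str = "unconfirmed"

    def review(assertion: Assertion, owner_accepts: bool) -> Assertion:
        # Only the owning team moves an assertion out of 'unconfirmed';
        # until then, other teams treat it as best available information.
        assertion.status = "authentic" if owner_accepts else "rejected"
        return assertion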
Last time I looked at RDF it was still disappointing when it comes to baselining parts of a model. Although most triple stores now have some kind of context or graph ID, there isn't 'native' support for versioning, or for saying graph X contains Y at version Z; instead the movement seems to be towards having a graph ID per triple, which doesn't seem very manageable to apply version control to. The model has a few thousand packages, each with a few classes, each with a few members, each of which would require several triples to describe, so a baseline would be of the order of one or two million triples. Baselining on a graph dump would be possible - just dump all the triples in the baseline out to a file and put that in source control - but that moves the mechanics out of the RDF metadata. Doing that with XMI and source control means there is nothing other than diff'ing the XMI files between versions to say what has changed; part of wanting to move to a distributed graph format is to get a difference mechanism which understands that the model is a graph rather than a text file.
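The back-of-envelope count, and the kind of set-based diff a triple representation allows (ignoring the blank-node canonicalisation a real RDF diff would also need), as a sketch with invented figures and triples:

    # ~5000 packages x 5 classes x 10 members x 8 triples each - all
    # invented multipliers - is already two million triples per baseline.
    print(5000 * 5 * 10 * 8)  # 2000000

    # Treating a baseline as a set of triples gives a diff that knows the
    # model is a graph, not a text file.
    baseline_1 = {("Altimeter", "rdf:type", "uml:Class"),
                  ("Altimeter", "uml:owner", "sensors")}
    baseline_2 = {("Altimeter", "rdf:type", "uml:Class"),
                  ("Altimeter", "uml:owner", "instruments")}

    print("added:", baseline_2 - baseline_1)    # triples new in baseline 2
    print("removed:", baseline_1 - baseline_2)  # triples gone since baseline 1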
Labels: distributed, uml