Jackrabbit
Link: http://incubator.apache.org/jackrabbit/
I've had a long interest in CMS, mainly in the model->transformation->model view of refinement in engineering design, and in the facilities that CMS + common, queryable formats + hypertext can give for ambient design capture.
Jackrabbit is an Apache Incubator project acting as the reference implementation of JSR-170, a fine-grained repository API. I've been thinking about some form of XMI-based versioned network database for some modelling tools I've been playing with, and there isn't much in the way of repositories that allow the update mechanisms, so this is interesting. Also, I've had half an ear to what's happening in grid data systems and heard some talk about RDF triple-store synchronization using RSS, though I don't yet see XML-based traffic or JSR-170's API as quite becoming congruent with shunting big grid datasets around the net; that's still the arena of binary. (Then there are tools like BitTorrent that do the shunting well, but at the expense of being a 'pure functional' API and not providing any update or synchronization.) Each is trying to keep a distributed model consistent or make best use of network bandwidth; there are various levels of maturity in the technology, from CVS to WebDAV to JSR-170 to data grid to XMI update, and I think something interesting may be about to happen if they can come together.
Can you embed XMI models in Atom and use pub/sub as the synchronization mechanism and leverage some of the existing infrastructure? Or do we flatten the models to RDF? I'd rather not lose XMI's advantages, and the RSS infrastructure doesn't seem to be leveraging RDF's relational properties, so it may just be useful as a packaging mechanism.
Does the lack of a standardized 'modification counter' in JSR-170 make pub/sub difficult, or would the update mechanism be behind the API? You could just add a deep event listener for the whole repository that manages the mod counts, but I'm not sure that is quite elegant. The XMI update mechanism doesn't provide a modification count either, so that may point to it belonging at the transport level. The modification event listener would increment the modification state of nodes as they are updated (or just log the node UUID against a global increment), and so provide a query interface for the list of nodes updated since a given global increment. The query returns a list of the nodes, with UUID and path, maybe as Atom (which allows link, uuid, and inband and outband data), with logic to restrict the query to a sub-branch of the repository. That should then allow clients to request updates on said nodes. The mechanics of updating don't need to be part of the modified content stream, but could be, depending on bandwidth.
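To make the mod-count idea a bit more concrete, here is a rough sketch of the "log the node UUID against a global increment" variant, with the "updated since N, under this sub-branch" query. It is plain Java and deliberately does not touch the JSR-170 API: the names ModificationLog, Change, record and updatedSince are all made up for illustration. In a real repository I'd assume record() gets called from a javax.jcr.observation.EventListener registered deep on the root, but that wiring is not shown and the log here is in-memory only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical change record: which node changed, and at which global increment. */
final class Change {
    final long increment;
    final String uuid;
    final String path;

    Change(long increment, String uuid, String path) {
        this.increment = increment;
        this.uuid = uuid;
        this.path = path;
    }
}

/** Hypothetical in-memory modification log; an assumption, not part of JSR-170. */
final class ModificationLog {
    private final List<Change> changes = new ArrayList<>();
    private long counter = 0;

    /** Call once per node update; bumps and returns the global increment. */
    synchronized long record(String uuid, String path) {
        counter++;
        changes.add(new Change(counter, uuid, path));
        return counter;
    }

    /** The current global increment, i.e. what a client stores as its sync mark. */
    synchronized long current() {
        return counter;
    }

    /**
     * Nodes updated after 'since' whose path lies in the sub-branch rooted at
     * 'branch'. Only the latest change per UUID is returned, ordered by increment.
     */
    synchronized List<Change> updatedSince(long since, String branch) {
        Map<String, Change> latest = new LinkedHashMap<>();
        for (Change c : changes) {
            if (c.increment > since && isUnder(c.path, branch)) {
                latest.remove(c.uuid); // re-insert so ordering follows the newest change
                latest.put(c.uuid, c);
            }
        }
        return new ArrayList<>(latest.values());
    }

    /** True if 'path' is 'branch' itself or a descendant of it. */
    private static boolean isUnder(String path, String branch) {
        if (branch.equals("/")) {
            return true;
        }
        return path.equals(branch) || path.startsWith(branch + "/");
    }
}
```

A client would remember the value of current() from its last sync, ask for updatedSince(mark, "/some/branch"), and then fetch each returned node by UUID or path. Turning the returned list into Atom entries (one entry per node, with the UUID as the entry id) would be a separate step on top of this.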
Need to find more time to play.
Oh, and the Fedora Core 3 installer finds the CD-ROM on my Sony just fine :)
TME