Building a Bridge: DITA, DocBook, and ODF
By Eric Armstrong on Sep 14, 2007
Some folks here are taking a very strong look at DITA. I'm certainly one of them. But we also have a huge legacy of documents in Solbook format (Sun's subset of DocBook). There are tools for editing such documents, and tools for processing them. And there are many people who are comfortable with those tools. So DITA isn't going to replace the world just yet.
But DITA makes extensive reuse possible. It's a format with a serious future, because "reuse" is a very big deal. It lets you single-source your information content, so you have one place to make an edit. That sort of thing becomes important when you have multiple revisions of a product, and/or multiple variations. It becomes important when different tools and different products use the same information in different ways. It can drastically improve quality and ensure uniformity of presentation. Finally, structured formats like DITA and DocBook create the kind of consistently-tagged information that allows for useful automation.
So how do we bridge those two worlds? Fortunately, there are two sets of tools that make it possible:
- DITA/DocBook/ODF Transforms
- CMS Plugins
Flatirons Solutions currently has a set of transforms (the Document Interoperability Framework) that convert DITA to DocBook, and vice versa. They're available now. And they're working on a set that will convert those formats to the OASIS Open Document Format (ODF).
The ODF transforms are pretty interesting. They would make it possible to edit DITA or DocBook documents in OpenOffice--an open source suite of tools that is available to everyone. That's a far cry from the kind of money you have to spend to get a really good editor these days. (Those editors will still be needed for handling content references, at the very least. But it will be interesting to see what can be done using OpenOffice.)
But it's the DITA/DocBook transforms that are of most interest for interchange with legacy systems and tools. (There is also the question of how they handle DITA content references and DocBook entity references. But that's one of the tricky details that a concept paper like this can skim over...)
The ability to convert one document format into another is good, but it creates a problem--dual-sourcing. If you extract a document from a repository and convert it to another format, you now have two copies. What happens when the original document changes? How do you find out? How do your edits make it back to the original? If there is no back-path for such changes, how do you ensure that copies are never modified?
The problems stem from dual-sourcing. The solution is to maintain single-sourcing. And that's where plugins come in.
I was reminded of that plugin capability when I saw it listed as a feature of the XDocs CMS. It was in an evaluation-table I put together quite a while ago. I didn't know what I could do with that feature at the time, but it seemed useful so I made a note of it.
The other morning, while pedaling to work, it came to me that the plugin feature could be used to create a bridge to an external repository.
XDocs is an XML repository that can store both DITA and DocBook, but let's say it's devoted to DITA work. And let's say that DocBook/Solbook documents are in a separate repository. A plugin can be written that accesses the external repository and applies the transforms to them, presenting them as DITA documents to the XDocs user.
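To make the idea concrete, here is a minimal sketch in Python of what such a bridge plugin might look like. Everything in it is an assumption made for illustration: the `BridgePlugin` and `InMemoryRepository` classes are hypothetical, not part of XDocs or any real CMS, and the two transform functions are toy stand-ins for real transforms such as the Flatirons ones, handling only a title and paragraphs. The point it tries to show is that the DocBook document stays the single source, and the DITA view is produced on the way in and converted back on the way out.

```python
import xml.etree.ElementTree as ET


def docbook_to_dita(docbook_xml: str) -> str:
    """Toy stand-in for a real DocBook-to-DITA transform (title and paragraphs only)."""
    section = ET.fromstring(docbook_xml)
    topic = ET.Element("topic", {"id": section.get("id", "untitled")})
    ET.SubElement(topic, "title").text = section.findtext("title", default="")
    body = ET.SubElement(topic, "body")
    for para in section.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(topic, encoding="unicode")


def dita_to_docbook(dita_xml: str) -> str:
    """Toy stand-in for the reverse transform, so edits have a path back."""
    topic = ET.fromstring(dita_xml)
    section = ET.Element("section", {"id": topic.get("id", "untitled")})
    ET.SubElement(section, "title").text = topic.findtext("title", default="")
    for p in topic.findall("body/p"):
        ET.SubElement(section, "para").text = p.text
    return ET.tostring(section, encoding="unicode")


class InMemoryRepository:
    """Hypothetical external DocBook repository, reduced to a dictionary."""

    def __init__(self):
        self._docs = {}

    def fetch(self, doc_id: str) -> str:
        return self._docs[doc_id]

    def store(self, doc_id: str, xml_text: str) -> None:
        self._docs[doc_id] = xml_text


class BridgePlugin:
    """Hypothetical CMS plugin that presents external DocBook documents as DITA.

    No converted copy is kept: every read transforms the current original,
    and every write is transformed back into the external repository.
    """

    def __init__(self, external: InMemoryRepository):
        self.external = external

    def open_as_dita(self, doc_id: str) -> str:
        return docbook_to_dita(self.external.fetch(doc_id))

    def save_from_dita(self, doc_id: str, dita_xml: str) -> None:
        self.external.store(doc_id, dita_to_docbook(dita_xml))


repo = InMemoryRepository()
repo.store("intro", '<section id="intro"><title>Intro</title><para>Hello</para></section>')
bridge = BridgePlugin(repo)
print(bridge.open_as_dita("intro"))
```

Because the plugin never stores the DITA form, there is no second copy to drift out of date; that is the single-sourcing property the plugin is meant to preserve.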
These thoughts imply that plugin capability is a critical feature for any CMS that may eventually need to allow for document interchange.
A Two-Way Bridge
As long as the external repository has APIs that can be utilized, the plugin and transformation are possible. But things become even more interesting if the external repository has the capacity for plugins, as well. In that case, the plugins could talk to each other.
That's an ideal scenario, because a single set of interchange-APIs can be defined. (RESTful APIs, of course, based on the HTTP protocol.) Those APIs can then be used across a variety of repositories. Instead of writing a new plugin for every repository, you write one and give it the address you want to connect to. You might still run multiple plugins, one for each external repository, but you only have one set of code.
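A rough sketch of the "one set of code, many addresses" idea, again in Python. The resource layout (`/documents/<id>` with a `format` query parameter) and the `InterchangeClient` class are assumptions invented for this example; no such interchange API has actually been defined. The host names are placeholders as well.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


class InterchangeClient:
    """Hypothetical client for an assumed RESTful interchange API.

    The same code serves any repository; only the base address differs.
    """

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def document_url(self, doc_id: str, fmt: str) -> str:
        # Assumed layout: <base>/documents/<id>?format=<dita|docbook>
        return f"{self.base_url}/documents/{quote(doc_id, safe='')}?{urlencode({'format': fmt})}"

    def fetch(self, doc_id: str, fmt: str = "dita") -> str:
        """GET the document, asking the remote plugin for the given format."""
        with urlopen(self.document_url(doc_id, fmt)) as response:
            return response.read().decode("utf-8")

    def store(self, doc_id: str, fmt: str, xml_text: str) -> int:
        """PUT an edited document back; returns the HTTP status code."""
        request = Request(
            self.document_url(doc_id, fmt),
            data=xml_text.encode("utf-8"),
            method="PUT",
            headers={"Content-Type": "application/xml"},
        )
        with urlopen(request) as response:
            return response.status


# One class, two placeholder repositories: only the address changes.
solbook = InterchangeClient("http://solbook.example/api/")
archive = InterchangeClient("http://archive.example/api")
print(solbook.document_url("install-guide", "dita"))
print(archive.document_url("install-guide", "dita"))
```

Nothing here contacts a server until `fetch` or `store` is called, so the sketch only demonstrates how a single client could be pointed at different repositories.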
An ODF Bridge?
The only remaining problem is how to make content in those repositories available to OpenOffice. I'm not quite sure how to do that, but maybe there is something I don't know about OpenOffice.
I'm used to using it to access the file system. But maybe there is some way to get it to access a repository? Or maybe it would be possible to write a plugin that makes a repository appear as though it were part of the file system?
That would be ideal, because then we would be back to having one plugin talk to another, doing the conversion necessary to make the external repository appear as though it were in the tool's native format.
- Flatirons Solutions: Document Interoperability Framework
- XDocs: One Cool CMS
- RESTful APIs