Why RDFcat uses XHTML 2.0 for its documentation format

After some thought, I have settled on XHTML 2.0. Even though I have used DocBook for open source projects in the past (and in fact I am the author of docbook2X), I have decided against the use of DocBook.

Why not DocBook?

The preceding arguments against DocBook can be summed up as follows. Although DocBook has been retrofitted as an XML format, its basic design still follows the SGML principles of earlier times. If we want to use the arguably better documentation design that the World Wide Web has shown us over the past ten years, then DocBook is probably not the right choice.

So which document format?

Having rejected DocBook, we must now ask whether we want to use a standard documentation format at all, and if so, which one.

Obviously, we want to make the documentation format as standardized as possible. I choose XHTML 2.0, for the following reasons.

And how to display it?

The reflex reaction of XML people like us to HTML documentation is probably to cringe, because we expect to have to work around the browser tag-soup mess. But this is 2006, not 1998. Many users are already using Mozilla or Internet Explorer 6. These browsers support XML delivered with an XSLT stylesheet that converts it to XHTML for display. For other browsers, we can use server-side XSLT, or preprocess the files statically beforehand. That is what this Web site does now [4].
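To make the client-side route concrete, here is a minimal sketch in Python of how a document can be tagged so that a browser applies an XSLT stylesheet to it. It adds the standard `xml-stylesheet` processing instruction ahead of the root element. The function name `attach_stylesheet` and the stylesheet file name are hypothetical, chosen only for illustration; this is not the script the RDFcat site itself uses.

```python
from xml.dom import minidom


def attach_stylesheet(xml_text: str, href: str) -> str:
    """Return xml_text with an xml-stylesheet processing instruction
    inserted before the root element.

    Assumption: href contains no double quotes, since it is placed
    verbatim inside a quoted pseudo-attribute.
    """
    doc = minidom.parseString(xml_text)
    # "text/xsl" is the type that Mozilla and Internet Explorer 6 recognise
    # for XSLT stylesheets referenced this way.
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{href}"'
    )
    doc.insertBefore(pi, doc.documentElement)
    return doc.toxml()


if __name__ == "__main__":
    # "xhtml2.xsl" is a made-up stylesheet name.
    print(attach_stylesheet("<doc><title>Example</title></doc>", "xhtml2.xsl"))
```

A browser that understands the instruction fetches the stylesheet and renders the transformed result. For other browsers the same stylesheet would have to be applied on the server or ahead of time.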

The other part of XSL is supposedly XSL-FO. We do not have XSL-FO yet, but this does not matter. In fact, CSS serves a similar purpose to XSL-FO, namely to describe display presentation, and it ought to be used in place of XSL-FO for online browsing, not only because it is well supported, but also because it preserves the DOM. (The DOM of an XHTML document is meaningful, and you can extract at least some semantics from it, which is not true of the DOM of an XSL-FO document.)
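As a small illustration of what "extracting semantics from the DOM" can mean, the sketch below walks an XHTML 2.0 style document and recovers its section outline from the nested `section` and `h` elements. The sample document and the function name `outline` are invented for this example, and the namespace URI is the one used by the XHTML 2.0 working drafts. An XSL-FO rendering of the same page would contain only formatting objects, so no comparable outline could be read back from it.

```python
import xml.etree.ElementTree as ET

# Namespace URI from the XHTML 2.0 working drafts.
XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"

# Invented sample document, for illustration only.
SAMPLE = """\
<html xmlns="http://www.w3.org/2002/06/xhtml2/">
  <body>
    <section>
      <h>First section</h>
      <p>Some text.</p>
      <section>
        <h>Nested section</h>
      </section>
    </section>
    <section>
      <h>Second section</h>
    </section>
  </body>
</html>
"""


def outline(element, depth=0):
    """Yield (depth, heading text) for every section, in document order."""
    for child in element:
        if child.tag == f"{{{XHTML2_NS}}}section":
            heading = child.find(f"{{{XHTML2_NS}}}h")
            title = "".join(heading.itertext()).strip() if heading is not None else ""
            yield depth, title
            # Sections nested inside this one sit one level deeper.
            yield from outline(child, depth + 1)
        else:
            # Non-section containers (html, body, ...) do not add depth.
            yield from outline(child, depth)


if __name__ == "__main__":
    for depth, title in outline(ET.fromstring(SAMPLE)):
        print("  " * depth + title)
```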

What if I want to view the documentation offline?

Although I mentioned server-side scripts as a possibility (which would rule out offline viewing without an HTTP server), the RDFcat documentation does not require them. Most of the documentation is generated statically, even on the Web site, with bits of XSLT and RDF.

Print output can be obtained by transforming to TeX or groff. (This has not been done yet.)


  1. In fact, the author managed to implement the automatic XSL transformation for the XHTML 2.0 documents of this Web site in a few hours.
  2. Some XML gurus suggest using XML editors, but many of them are either immature or proprietary software.
  3. This practice is actually quite annoying if you happen to be the author of a stylesheet package and are looking for examples to test on.
  4. If you are reading this page with Mozilla Firefox, try viewing the page source!

$LastChangedDate: 2006-03-05 19:42:00 -0500 #$