XML & DocBook: Structured Technical Documentation Authoring

Reading time: 2 – 3 minutes


XML is short for Extended Markup Language and is a subset of SGML, the Standard Generalized Markup Language. XML is an HTML-like formatting language. Whereas most HTML-related formats developed in the past adopted the “be conservative in what you send and liberal in what you receive” attitude, XML takes the opposite approach–documents should be 100% compatible. This compatibility is known as “well-formedness” of an XML document. To this end, even when the goal is clear, a document is rejected if it does not follow XML specifications to the fullest extent. In terms of practicality, this approach guarantees interoperability in the long run. Unlike HTML, which is the standard groupname for a lot of sub-protocols that are slightly different and not fully interoperable with one another, the strictness of XML ensures compatibility. XML also improves security dramatically, because there is only one way to interpret expressions, a way on which everybody agrees.

DocBook is an XML Document Type Definition or DTD. It is a subset of XML particularly suited for but not limited to the creation of books and papers about computer hardware and software. DocBook is well-known in the Linux community and is used by many publishing companies and open-source development projects. Most tools are developed for the DocBook DTD and are included in most Linux distributions. This allows for sending raw data that can be processed at the receiver’s end–wherever applications able to interpret XML directly are available.

The important thing to keep in mind is XML and DocBook let authors focus on content. In that sense, the presence of the word “markup” in the definition of XML is misleading. With XML, authors specify what type of data they are including, such as text explanations, command names, tables or images. How the content is formatted, laid out and displayed should not be their concern. From a single source, the receiver might generate PDF, PS, plain text, HTML and many other representations of the content.

Another advantage is DocBook XML files are written in plain text. Although many editors are available, such as oXygen and XMLmind, advanced DocBook users easily can write the source texts using vim, Emacs or any other text editor.

Publicat al Linux Journal: XML & DocBook: Structured Technical Documentation Authoring