Chapter 5. DocBook

AsciiDoc generates article, book and refentry DocBook documents (corresponding to the AsciiDoc article, book and manpage document types).

Most Linux distributions come with conversion tools (collectively called a toolchain) for converting DocBook files to presentation formats such as HTML, PDF, EPUB, DVI, PostScript, LaTeX, roff (the native man page format), HTMLHelp, JavaHelp and text. There are also programs that allow you to view DocBook files directly, for example Yelp (the GNOME help viewer).

5.1. Converting DocBook to other file formats

DocBook files are validated, parsed and translated to various presentation file formats using a combination of applications collectively called a DocBook toolchain. The function of a toolchain is to read the DocBook markup (produced by AsciiDoc) and transform it to a presentation format (for example HTML, PDF, HTML Help, EPUB, DVI, PostScript, LaTeX).

A wide range of user output format requirements, coupled with a choice of available tools and stylesheets, results in many valid toolchain combinations.
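
As a rough illustration of those stages, the sketch below chains a validation step and a transformation step. It assumes xmllint(1) (from libxml2) and xsltproc(1) are installed and that the DocBook DTD can be resolved locally; the function name docbook_to_html and its arguments are hypothetical, and the stylesheet is whichever DocBook XSL stylesheet you choose.

```shell
# Hypothetical helper: validate a DocBook file, then transform it to HTML.
#   $1  DocBook XML file (as produced by asciidoc -b docbook)
#   $2  DocBook XSL stylesheet to apply
docbook_to_html() {
    # Stage 1: parse and validate against the DTD named in the DOCTYPE;
    # --noout suppresses the normal dump of the parsed document.
    xmllint --nonet --noout --valid "$1" || return 1
    # Stage 2: apply the stylesheet; the result is written to stdout.
    xsltproc --nonet "$2" "$1"
}
```

A call might look like docbook_to_html article.xml xhtml.xsl > article.html, where both file names are placeholders.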

5.2. a2x Toolchain Wrapper

One of the biggest hurdles for new users is installing, configuring and using a DocBook XML toolchain. a2x(1) can help — it’s a toolchain wrapper command that will generate XHTML (chunked and unchunked), PDF, EPUB, DVI, PS, LaTeX, man page, HTML Help and text file outputs from an AsciiDoc text file. a2x(1) does all the grunt work associated with generating and sequencing the toolchain commands and managing intermediate and output files. a2x(1) also optionally deploys admonition and navigation icons and a CSS stylesheet. See the a2x(1) man page for more details. In addition to asciidoc(1) you also need xsltproc(1), DocBook XSL Stylesheets and optionally: dblatex or FOP (to generate PDF); w3m(1) or lynx(1) (to generate text).

The following examples generate doc/source-highlight-filter.pdf from the AsciiDoc doc/source-highlight-filter.txt source file. The first example uses dblatex(1) (the default PDF generator); the second example forces FOP to be used:

$ a2x -f pdf doc/source-highlight-filter.txt
$ a2x -f pdf --fop doc/source-highlight-filter.txt

See the a2x(1) man page for details.

[Tip]

Use the --verbose command-line option to view executed toolchain commands.
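
To see the tip applied across several outputs, a small loop like the following may help. The function name build_formats is hypothetical and the choice of formats is only an example; it is a sketch, not part of a2x(1) itself.

```shell
# Hypothetical helper: build several output formats from one source file,
# printing the toolchain commands a2x executes along the way.
#   $1  AsciiDoc source file
build_formats() {
    for fmt in xhtml chunked pdf; do
        # Stop at the first format that fails to build.
        a2x --verbose -f "$fmt" "$1" || return 1
    done
}
```

For instance, build_formats doc/source-highlight-filter.txt would run a2x(1) three times on the source file used in the examples above.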

5.3. HTML generation

AsciiDoc produces nicely styled HTML directly without requiring a DocBook toolchain, but there are also advantages in going the DocBook route:

  • HTML from DocBook can optionally include automatically generated indexes, tables of contents, footnotes, lists of figures and tables.
  • DocBook toolchains can also (optionally) generate separate (chunked) linked HTML pages for each document section.
  • Toolchain processing performs link and document validity checks.
  • If the DocBook lang attribute is set then generated text such as the table of contents title, figure and table captions, and admonition captions is output in the specified language (setting the AsciiDoc lang attribute sets the DocBook lang attribute).

On the other hand, HTML output directly from AsciiDoc is much faster, is easily customized and can be used in situations where there is no suitable DocBook toolchain (for example, see the AsciiDoc website).
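
The two routes can be compared side by side with a sketch like the one below. The function names and the default language are hypothetical choices made for illustration; the sketch assumes asciidoc(1) and a2x(1) are on the PATH.

```shell
# Hypothetical helpers contrasting the two routes to HTML.
#   $1  AsciiDoc source file
#   $2  language code for generated captions (optional, defaults to en)

# Direct route: asciidoc(1) writes a single HTML file, no DocBook toolchain.
html_direct() {
    asciidoc -b xhtml11 -a "lang=${2:-en}" "$1"
}

# DocBook route: a2x(1) drives the toolchain and writes chunked HTML pages.
html_chunked() {
    a2x -f chunked -a "lang=${2:-en}" "$1"
}
```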

5.4. PDF generation

There are two commonly used tools to generate PDFs from DocBook: dblatex and FOP.

dblatex or FOP?

  • dblatex is easier to install: there’s zero configuration required and no Java VM to install, so it just works out of the box.
  • dblatex source code highlighting and numbering is superb.
  • dblatex is easier to use as it converts DocBook directly to PDF, whereas before using FOP you have to convert DocBook to XSL-FO using DocBook XSL Stylesheets.
  • FOP is more feature complete (for example, callouts are processed inside literal layouts) and arguably produces nicer looking output.
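
The difference in the number of steps can be sketched as follows. The function names are hypothetical, and the default stylesheet path is an assumption borrowed from the fo.xsl driver example in this chapter (it only resolves when run from the distribution ./doc directory).

```shell
# Hypothetical helpers showing the two PDF routes from a DocBook file.

# dblatex: a single step from DocBook to PDF.
#   $1  DocBook XML file
pdf_via_dblatex() {
    dblatex -t pdf "$1"
}

# FOP: two steps, DocBook -> XSL-FO -> PDF.
#   $1  DocBook XML file (name.xml)
#   $2  XSL-FO driver stylesheet (optional, assumed default below)
pdf_via_fop() {
    xsltproc --nonet "${2:-../docbook-xsl/fo.xsl}" "$1" > "${1%.xml}.fo" &&
        fop "${1%.xml}.fo" "${1%.xml}.pdf"
}
```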

5.5. HTML Help generation

  1. Convert DocBook XML documents to HTML Help compiler source files using DocBook XSL Stylesheets and xsltproc(1).
  2. Convert the HTML Help source (.hhp and .html) files to HTML Help (.chm) files using the Microsoft HTML Help Compiler.

5.6. Toolchain components summary

AsciiDoc
Converts AsciiDoc (.txt) files to DocBook XML (.xml) files.
DocBook XSL Stylesheets
These are a set of XSL stylesheets containing rules for converting DocBook XML documents to HTML, XSL-FO, man page and HTML Help files. The stylesheets are used in conjunction with an XSLT processor such as xsltproc(1).
xsltproc
An XSLT processor that applies XSLT stylesheets (in our case the DocBook XSL Stylesheets) to XML documents.
dblatex
Generates PDF, DVI, PostScript and LaTeX formats directly from DocBook source via the intermediate LaTeX typesetting language. It uses DocBook XSL Stylesheets, xsltproc(1) and latex(1).
FOP
The Apache Formatting Objects Processor converts XSL-FO (.fo) files to PDF files. The XSL-FO files are generated from DocBook source files using DocBook XSL Stylesheets and xsltproc(1).
Microsoft HTML Help Compiler
The Microsoft HTML Help Compiler (hhc.exe) is a command-line tool that converts HTML Help source files to a single HTML Help (.chm) file. It runs on MS Windows platforms and can be downloaded from http://www.microsoft.com.

5.7. AsciiDoc dblatex configuration files

The AsciiDoc distribution ./dblatex directory contains asciidoc-dblatex.xsl (customized XSL parameter settings) and asciidoc-dblatex.sty (customized LaTeX settings). These are examples of optional dblatex output customization and are used by a2x(1).
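
If you run dblatex by hand, the same customization files can be passed explicitly. The sketch below is an illustration rather than the exact command a2x(1) issues: the function name is hypothetical, and the default directory assumes you are working in the distribution ./doc directory.

```shell
# Hypothetical helper: run dblatex with the AsciiDoc customization files.
#   $1  DocBook XML file
#   $2  path to the distribution dblatex directory (optional; the default
#       ../dblatex is an assumption that fits the ./doc directory)
pdf_with_asciidoc_style() {
    # -p supplies user XSL parameter settings, -s a LaTeX style file.
    dblatex -t pdf \
        -p "${2:-../dblatex}/asciidoc-dblatex.xsl" \
        -s "${2:-../dblatex}/asciidoc-dblatex.sty" \
        "$1"
}
```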

5.8. AsciiDoc DocBook XSL Stylesheets drivers

You will have noticed that the distributed HTML and HTML Help documentation files (for example ./doc/asciidoc.html) are not the plain outputs produced using the default DocBook XSL Stylesheets configuration. This is because they have been processed using customized DocBook XSL Stylesheets along with (in the case of HTML outputs) the custom ./stylesheets/docbook-xsl.css CSS stylesheet.

You’ll find the customized DocBook XSL drivers along with additional documentation in the distribution ./docbook-xsl directory. The examples that follow are executed from the distribution documentation (./doc) directory. These drivers are also used by a2x(1).

common.xsl
Shared driver parameters. This file is not used directly but is included in all the following drivers.
chunked.xsl

Generate chunked XHTML (separate HTML pages for each document section) in the ./doc/chunked directory. For example:

$ python ../asciidoc.py -b docbook asciidoc.txt
$ xsltproc --nonet ../docbook-xsl/chunked.xsl asciidoc.xml
epub.xsl
Used by a2x(1) to generate EPUB formatted documents.
fo.xsl

Generate XSL Formatting Object (.fo) files for subsequent PDF file generation using FOP. For example:

$ python ../asciidoc.py -b docbook article.txt
$ xsltproc --nonet ../docbook-xsl/fo.xsl article.xml > article.fo
$ fop article.fo article.pdf
htmlhelp.xsl

Generate Microsoft HTML Help source files for the MS HTML Help Compiler in the ./doc/htmlhelp directory. This example is run on MS Windows from a Cygwin shell prompt:

$ python ../asciidoc.py -b docbook asciidoc.txt
$ xsltproc --nonet ../docbook-xsl/htmlhelp.xsl asciidoc.xml
$ c:/Program\ Files/HTML\ Help\ Workshop/hhc.exe htmlhelp.hhp
manpage.xsl

Generate a roff(1) format UNIX man page from a DocBook XML refentry document. This example generates an asciidoc.1 man page file:

$ python ../asciidoc.py -d manpage -b docbook asciidoc.1.txt
$ xsltproc --nonet ../docbook-xsl/manpage.xsl asciidoc.1.xml
xhtml.xsl

Convert a DocBook XML file to a single XHTML file. For example:

$ python ../asciidoc.py -b docbook asciidoc.txt
$ xsltproc --nonet ../docbook-xsl/xhtml.xsl asciidoc.xml > asciidoc.html

If you want to see how the complete documentation set is processed, take a look at the A-A-P script ./doc/main.aap.