Chapter 8. Document Structure

An AsciiDoc document consists of a series of block elements starting with an optional document Header, followed by an optional Preamble, followed by zero or more document Sections.

Almost any combination of zero or more elements constitutes a valid AsciiDoc document: documents can range from a single sentence to a multi-part book.

8.1. Block Elements

Block elements consist of one or more lines of text and may contain other block elements.

The AsciiDoc block structure can be informally summarized as follows [1]:

Document      ::= (Header?,Preamble?,Section*)
Header        ::= (Title,(AuthorInfo,RevisionInfo?)?)
AuthorInfo    ::= (FirstName,(MiddleName?,LastName)?,EmailAddress?)
RevisionInfo  ::= (RevisionNumber?,RevisionDate,RevisionRemark?)
Preamble      ::= (SectionBody)
Section       ::= (Title,SectionBody?,(Section)*)
SectionBody   ::= ((BlockTitle?,Block)|BlockMacro)+
Block         ::= (Paragraph|DelimitedBlock|List|Table)
List          ::= (BulletedList|NumberedList|LabeledList|CalloutList)
BulletedList  ::= (ListItem)+
NumberedList  ::= (ListItem)+
CalloutList   ::= (ListItem)+
LabeledList   ::= (ListEntry)+
ListEntry     ::= (ListLabel,ListItem)
ListLabel     ::= (ListTerm+)
ListItem      ::= (ItemText,(List|ListParagraph|ListContinuation)*)

Where:

  • ? implies zero or one occurrence, + implies one or more occurrences, * implies zero or more occurrences.
  • All block elements are separated by line boundaries.
  • BlockId, AttributeEntry and AttributeList block elements (not shown) can occur almost anywhere.
  • There are a number of document type and backend specific restrictions imposed on the block syntax.
  • The following elements cannot contain blank lines: Header, Title, Paragraph, ItemText.
  • A ListParagraph is a Paragraph with its listelement option set.
  • A ListContinuation is a list continuation element.

8.2. Header

The Header contains document meta-data, typically title plus optional authorship and revision information:

  • The Header is optional, but if it is used it must start with a document title.
  • Optional Author and Revision information immediately follows the header title.
  • The document header must be separated from the remainder of the document by one or more blank lines and cannot contain blank lines.
  • The header can include comments.
  • The header can include attribute entries, typically doctype, lang, encoding, icons, data-uri, toc, numbered.
  • Header attributes are overridden by command-line attributes.
  • If the header contains non-UTF-8 characters then the encoding must precede the header (either in the document or on the command-line).

Here’s an example AsciiDoc document header:

Writing Documentation using AsciiDoc
====================================
Joe Bloggs <jbloggs@mymail.com>
v2.0, February 2003:
Rewritten for version 2 release.

The author information line contains the author’s name optionally followed by the author’s email address. The author’s name is formatted like:

firstname[ [middlename ]lastname][ <email>]]

i.e. a first name followed by optional middle and last names followed by an email address in that order. Multi-word first, middle and last names can be entered using the underscore as a word separator. The email address comes last and must be enclosed in angle <> brackets. Here a some examples of author information lines:

Joe Bloggs <jbloggs@mymail.com>
Joe Bloggs
Vincent Willem van_Gogh

If the author line does not match the above specification then the entire author line is treated as the first name.

The optional revision information line follows the author information line. The revision information can be one of two formats:

  1. An optional document revision number followed by an optional revision date followed by an optional revision remark:

    • If the revision number is specified it must be followed by a comma.
    • The revision number must contain at least one numeric character.
    • Any non-numeric characters preceding the first numeric character will be dropped.
    • If a revision remark is specified it must be preceded by a colon. The revision remark extends from the colon up to the next blank line, attribute entry or comment and is subject to normal text substitutions.
    • If a revision number or remark has been set but the revision date has not been set then the revision date is set to the value of the docdate attribute.

    Examples:

    v2.0, February 2003
    February 2003
    v2.0,
    v2.0, February 2003: Rewritten for version 2 release.
    February 2003: Rewritten for version 2 release.
    v2.0,: Rewritten for version 2 release.
    :Rewritten for version 2 release.
  2. The revision information line can also be an RCS/CVS/SVN $Id$ marker:

    • AsciiDoc extracts the revnumber, revdate, and author attributes from the $Id$ revision marker and displays them in the document header.
    • If an $Id$ revision marker is used the header author line can be omitted.

    Example:

    $Id: mydoc.txt,v 1.5 2009/05/17 17:58:44 jbloggs Exp $

You can override or set header parameters by passing revnumber, revremark, revdate, email, author, authorinitials, firstname and lastname attributes using the asciidoc(1) -a (--attribute) command-line option. For example:

$ asciidoc -a revdate=2004/07/27 article.txt

Attribute entries can also be added to the header for substitution in the header template with Attribute Entry elements.

The title element in HTML outputs is set to the AsciiDoc document title, you can set it to a different value by including a title attribute entry in the document header.

8.2.1. Additional document header information

AsciiDoc has two mechanisms for optionally including additional meta-data in the header of the output document:

docinfo configuration file sections
If a configuration file section named docinfo has been loaded then it will be included in the document header. Typically the docinfo section name will be prefixed with a + character so that it is appended to (rather than replace) other docinfo sections.
docinfo files
Two docinfo files are recognized: one named docinfo and a second named like the AsciiDoc source file with a -docinfo suffix. For example, if the source document is called mydoc.txt then the document information files would be docinfo.xml and mydoc-docinfo.xml (for DocBook outputs) and docinfo.html and mydoc-docinfo.html (for HTML outputs). The docinfo attributes control which docinfo files are included in the output files.

The contents docinfo templates and files is dependent on the type of output:

HTML
Valid head child elements. Typically style and script elements for CSS and JavaScript inclusion.
DocBook
Valid articleinfo or bookinfo child elements. DocBook defines numerous elements for document meta-data, for example: copyrights, document history and authorship information. See the DocBook ./doc/article-docinfo.xml example that comes with the AsciiDoc distribution. The rendering of meta-data elements (or not) is DocBook processor dependent.

8.3. Preamble

The Preamble is an optional untitled section body between the document Header and the first Section title.

8.4. Sections

In addition to the document title (level 0), AsciiDoc supports four section levels: 1 (top) to 4 (bottom). Section levels are delimited by section titles. Sections are translated using configuration file section markup templates. AsciiDoc generates the following intrinsic attributes specifically for use in section markup templates:

level
The level attribute is the section level number, it is normally just the title level number (1..4). However, if the leveloffset attribute is defined it will be added to the level attribute. The leveloffset attribute is useful for combining documents.
sectnum
The -n (--section-numbers) command-line option generates the sectnum (section number) attribute. The sectnum attribute is used for section numbers in HTML outputs (DocBook section numbering are handled automatically by the DocBook toolchain commands).

8.4.1. Section markup templates

Section markup templates specify output markup and are defined in AsciiDoc configuration files. Section markup template names are derived as follows (in order of precedence):

  1. From the title’s first positional attribute or template attribute. For example, the following three section titles are functionally equivalent:

    [[terms]]
    [glossary]
    List of Terms
    -------------
    
    ["glossary",id="terms"]
    List of Terms
    -------------
    
    [template="glossary",id="terms"]
    List of Terms
    -------------
  2. When the title text matches a configuration file [specialsections] entry.
  3. If neither of the above the default sect<level> template is used (where <level> is a number from 1 to 4).

In addition to the normal section template names (sect1, sect2, sect3, sect4) AsciiDoc has the following templates for frontmatter, backmatter and other special sections: abstract, preface, colophon, dedication, glossary, bibliography, synopsis, appendix, index. These special section templates generate the corresponding Docbook elements; for HTML outputs they default to the sect1 section template.

8.4.2. Section IDs

If no explicit section ID is specified an ID will be synthesised from the section title. The primary purpose of this feature is to ensure persistence of table of contents links (permalinks): the missing section IDs are generated dynamically by the JavaScript TOC generator after the page is loaded. If you link to a dynamically generated TOC address the page will load but the browser will ignore the (as yet ungenerated) section ID.

The IDs are generated by the following algorithm:

  • Replace all non-alphanumeric title characters with underscores.
  • Strip leading or trailing underscores.
  • Convert to lowercase.
  • Prepend the idprefix attribute (so there’s no possibility of name clashes with existing document IDs). Prepend an underscore if the idprefix attribute is not defined.
  • A numbered suffix (_2, _3 …) is added if a same named auto-generated section ID exists.
  • If the ascii-ids attribute is defined then non-ASCII characters are replaced with ASCII equivalents. This attribute should be should be avoided if possible as its sole purpose is to accommodate deficient downstream applications that cannot process non-ASCII ID attributes. If available, it will use the trans python module, otherwise it will fallback to using NFKD algorithm, which cannot handle all unicode characters. For example, Wstęp żółtej łąki will be translated to Wstep zoltej laki under trans and Wstep zotej aki under NFKD.

Example: the title Jim’s House would generate the ID _jim_s_house.

Section ID synthesis can be disabled by undefining the sectids attribute.

8.4.3. Special Section Titles

AsciiDoc has a mechanism for mapping predefined section titles auto-magically to specific markup templates. For example a title Appendix A: Code Reference will automatically use the appendix section markup template. The mappings from title to template name are specified in [specialsections] sections in the Asciidoc language configuration files (lang-*.conf). Section entries are formatted like:

<title>=<template>

<title> is a Python regular expression and <template> is the name of a configuration file markup template section. If the <title> matches an AsciiDoc document section title then the backend output is marked up using the <template> markup template (instead of the default sect<level> section template). The {title} attribute value is set to the value of the matched regular expression group named title, if there is no title group {title} defaults to the whole of the AsciiDoc section title. If <template> is blank then any existing entry with the same <title> will be deleted.

8.5. Inline Elements

Inline document elements are used to format text and to perform various types of text substitution. Inline elements and inline element syntax is defined in the asciidoc(1) configuration files.

Here is a list of AsciiDoc inline elements in the (default) order in which they are processed:

Special characters
These character sequences escape special characters used by the backend markup (typically <, >, and & characters). See [specialcharacters] configuration file sections.
Quotes
Elements that markup words and phrases; usually for character formatting. See [quotes] configuration file sections.
Special Words
Word or word phrase patterns singled out for markup without the need for further annotation. See [specialwords] configuration file sections.
Replacements
Each replacement defines a word or word phrase pattern to search for along with corresponding replacement text. See [replacements] configuration file sections.
Attribute references
Document attribute names enclosed in braces are replaced by the corresponding attribute value.
Inline Macros
Inline macros are replaced by the contents of parametrized configuration file sections.


[1] This is a rough structural guide, not a rigorous syntax definition