Humanities E-Book
The Online Collection
Developed By Scholars

XML Tagging Specifications

1. Files and Resources

1.1 DTD

All books must be tagged in XML using the acls-hebook.dtd. All elements and attributes are described below.

The DTD and specifications are periodically updated, so always check for the latest versions before beginning a new book.

DTD: acls-hebook.dtd ver. 1.7; date: 03-15-2015 (view | download zipped file)
Character Entities: xhtml-lat1.ent (view | download zipped file)
Specifications (this web page):

1.2 XML Template

For examples on tagging and usage, refer to the XML template, acls-hebook-template.xml.

XML Template: acls-hebook-template.xml ver. 1.5; date: 08-18-2009 (view | download | download zipped)

1.3 XSLT Style Sheet

An HEB-specific XSLT style sheet may be used to transform an XML file into HTML to check formatting, links, etc. (The resulting single-page HTML document differs slightly from the version that will ultimately appear online, but will be identical in all relevant aspects.) We strongly recommend using this tool for e-book testing prior to submitting the completed XML file. The file (acls-hebook.xsl) is available for download below. The best way to view results is by making use of the zipped folders and files, retaining the same configuration; the transformed HTML file should be placed in the “output” folder in the “xsl” directory (which also contains the XSLT style sheet itself); JPEGs for figures and other media should be placed in the “figures” folder. This way, all related files as well as styles will be correctly reflected in the output HTML file.

Please contact the HEB staff with any questions.

XSLT: acls-hebook.xsl ver. 1.6; date: 03-02-2010 (view | download | download zipped)

2. Encoding, Styling, Hyphenation

Save files in US-ASCII encoding. Add the encoding type to the XML declaration:

<?xml version="1.0" encoding="us-ascii"?>

Special Characters

In order to ensure that our e-books are properly displayed across a range of standard web browsers, our system can currently only index and display the following characters:

Characters allowed | Rendered as | Notes
US-ASCII characters | | Only characters in the US-ASCII range are allowed.
ISO 8859-1 Latin-1 entities (#160–#255) | | Characters above US-ASCII must be encoded with entities (e.g., &eacute; for é) from xhtml-lat1.ent (see 1. Files and Resources above).
&hellip; | [ . . . ] | Make sure spacing and punctuation around &hellip; are correct, following the book's or publisher's in-house style.
&lt; &gt; | < > |
“ ” | " | Curly double quotes will be converted to straight double quotes.
‘ ’ | ' | Curly single quotes and apostrophes will be converted to straight single quotes.
&ndash; | - | Ndash will be converted to hyphen.
&#160; | | Use non-breaking space only when it is critical to retain original formatting (e.g., for historical documents). Do not use for general styling.
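
As a rough pre-submission check for the character rules above, the following Python sketch lists characters outside the US-ASCII range and rewrites Latin-1 characters (#160–#255) as named entities. The function names are illustrative and not part of any HEB tool. The entity names come from Python's standard html.entities table, which is assumed here to match xhtml-lat1.ent for this range.

```python
from html.entities import codepoint2name


def find_non_ascii(text):
    """Return (line, column, character) for each character above US-ASCII."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch))
    return hits


def latin1_to_entities(text):
    """Rewrite characters #160-#255 as named entities; leave all others alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 160 <= cp <= 255:
            # Assumption: HTML 4 Latin-1 names equal those in xhtml-lat1.ent.
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append(ch)
    return "".join(out)
```

Characters outside both ranges (for example, the oe ligature) pass through latin1_to_entities unchanged and are still reported by find_non_ascii, so they can be handled with additional entity declarations.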

If your book includes special or foreign characters outside this range, please add additional entities at the top of your XML file.

Example of added entity at top of an XML file:

<!ENTITY oelig "&#x0153;"> <!-- LATIN SMALL LIGATURE OE -->

We will need to work with the publisher and our programmers on how to index (for searching) and render the additional characters. Note that this should not affect the way you tag or prepare your text for print or other electronic versions. We just want to make it clear that our system can currently handle the entities listed above, and any additional characters will need to be dealt with as needed.


To tag italic, bold, struck out, underlined, superscript, or subscript type, use the <hi1> tag with the appropriate rend attribute value (see list below). Please note that note reference numbers should NOT be tagged using superscript (see 3.4.3. Links: Notes, Internal Links, URLs for tagging instructions).

<hi1 rend="italic">text</hi1>
<hi1 rend="italicsunderlined">text</hi1>
<hi1 rend="bold">text</hi1>
<hi1 rend="bolditalic">text</hi1>
<hi1 rend="boldund">text</hi1>
<hi1 rend="strike">text</hi1>
<hi1 rend="und">text</hi1>
<hi1 rend="sup">text</hi1>
<hi1 rend="supbold">text</hi1>
<hi1 rend="supund">text</hi1>
<hi1 rend="sub">text</hi1>

When adding styles to words/phrases followed by punctuation (e.g., “served as editor of the <hi1 rend="italic">Times</hi1>.”), please be consistent in either including or excluding punctuation from the <hi1></hi1> tags, provided this doesn’t conflict with the print version or publishing house style.
Small caps cannot be easily rendered in HTML browsers, so you must set small cap text in all caps, or convert text to title case. Publisher should advise conversion vendor on how to handle small caps. (It is usually preferable to convert to title case for <div> heads that will appear in the TOC.)


When converting titles from the print version, hyphens inserted to break a word at the end of a line as well as forced end of line breaks must be removed. For example, if “demo-” is on one line and “cracy” on the next, the hyphen should be removed so the word reads “democracy” with no hyphen.

3. Tagging Text

3.1 A General Note on XML Formatting

When tagging titles, please avoid as much as possible elaborate formatting and pretty-print features involving indents, tabs, and extra spaces. Characters such as bullets (&#8226;) and non-breaking spaces (&#160;) should be used only when it is critical to retain original formatting (e.g., for historical documents). Do not use for general styling. Please be especially mindful of avoiding forced line breaks and hard returns within tags. This will facilitate further processing of files at HEB.

3.2 Text Structure

3.2.1 Front, Body, Back

Break down the text using <front>, <body>, and <back> tags. Add mandatory HEB number heb900xx (last two digits are specific to title) and ISBN (E-Book) attributes to <text> tag. (Get this info from publisher/production editor.)

<text id="heb90001" isbn="1-234-5678-9">
<front>Title Page, Copyright, Dedication, etc.</front>
<body>Introduction, Chapter 1, Chapter 2, etc., Conclusion</body>
<back>Appendices, Notes, Bibliography, Index, etc.</back>
</text>

3.2.2 Divisions

Use division tags (<div1>, <div2>, <div3>, <div4>) to subdivide text within <front>, <body>, and <back>. For example, text with chapters and sections can be broken down as follows:

<div1 type="chapter" id="div1_c01">
<div2 type="section" id="div2_c01.1">

<div1 type="part" id="div1_pt1">
<div2 type="chapter" id="div2_c01">
<div3 type="section" id="div3_c01.1">

Every division must have a type and id attribute, and include a <head> tag (see below).

Note that division tags can be used for any type of text subdivision, not necessarily just the traditional chapter and section. For many “born-digital” works or new projects that are developed simultaneously with print editions, the publisher may choose to break down text in other ways (e.g., smaller units).

3.2.3 Text Chunks

E-books in our system will be delivered in text chunks by the lowest available division level. For example:

1. If a book is broken down by chapter in <div1> and section in <div2>, text should be delivered by section (i.e., the lowest-level division); to do this, add the attribute status="hidden" to the <div1> tag.

<div1 type="chapter" id="div1_c01" status="hidden">
<div2 type="section" id="div2_c01.1">
<div2 type="section" id="div2_c01.2">

(To check if each <div> head has been properly tagged, transform XML documents using XSLT. All heads will be listed sequentially in the TOC. Those <div>s including the status="hidden" attribute will not be hyperlinked, whereas those without the attribute will appear linked.)
2. For books without clear section breaks, you can break the chapter into smaller chunks (e.g., by tagging every 10 paragraphs in a separate division).

<div1 type="chapter" id="div1_c01" status="hidden">
<div2 type="section" id="div2_c01.1">Text for paragraphs 1-10.</div2>
<div2 type="section" id="div2_c01.2">Text for paragraphs 11-20.</div2>
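
When a chapter has to be split into fixed-size chunks, the paragraph-number ranges for each chunk can be worked out mechanically. The sketch below is only an illustration: para_ranges is a hypothetical helper, and the chunk size of 10 simply mirrors the example above.

```python
def para_ranges(first, last, size=10):
    """Split paragraph numbers first..last into consecutive 'start-end' ranges."""
    ranges = []
    start = first
    while start <= last:
        # The final chunk may be shorter than the requested size.
        end = min(start + size - 1, last)
        ranges.append(f"{start}-{end}")
        start = end + 1
    return ranges
```

For a chapter holding paragraphs 1 through 25, para_ranges(1, 25) gives three chunks, the last covering paragraphs 21-25. How a chunk containing a single paragraph should be written is not covered by this specification, so the helper leaves it in start-end form.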

3.2.4 Milestone Section Breaks

If you want to add a separation between a series of paragraphs with a simple skipped line space or asterisks, you can put a milestone tag between paragraphs.

<p>[text text text]</p>
<milestone rend="skipline"/> (for a blank line)
<milestone rend="asterisk"/> (for asterisks)
<p>[text text text]</p>

3.2.5 Heads

Every division must include a <head> tag. Subparts of heads should be broken down by type in <bibl> tags. Paragraph number ranges for text chunks must also be placed inside a <bibl> tag.

It is only necessary to add paragraph-number ranges for the division level at which text chunks will be delivered, not for higher-level divisions that include the status="hidden" attribute (i.e., if text in the e-book will be delivered by section, only add paragraph-number ranges to section heads, not part or chapter heads).

For sections that include a byline containing specific author information (for example, if the Introduction was written by a contributor other than the main author), this info should also be added to the head in a separate <bibl type="byline"> tag.

See example:

<div1 type="chapter" id="div1_c01" status="hidden">
<head>
<bibl type="number">Chapter 1</bibl>
<bibl type="title">Chapter Title</bibl>
<bibl type="subtitle">Chapter Subtitle</bibl>
</head>
<div2 type="section" id="div2_c01.1">
<head>
<bibl type="title">Section Title</bibl>
<bibl type="byline">by Author, translated by Translator</bibl>
<bibl type="para">22-39</bibl>
</head>

(Note: No period or other punctuation should be added after the final number appearing in <bibl type="number"> before the </bibl> closing tag; a colon is added here automatically by our processing script, and adding anything else will result in redundant punctuation. For example, tag as <bibl type="number">1.1</bibl>, not <bibl type="number">1.1.</bibl>.)
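
The punctuation rule for number heads can be applied automatically while converting print heads. This is a minimal sketch: clean_head_number is a hypothetical helper, and the set of characters it strips (period, colon, semicolon, comma, space) is an assumption about what typically trails a head number in print.

```python
def clean_head_number(text):
    """Drop trailing punctuation and spaces from the text of a number head."""
    # Only the end of the string is touched; internal periods such as "1.1" stay.
    return text.rstrip(" .:;,")
```

With this helper, "1.1." becomes "1.1" and "Chapter 1:" becomes "Chapter 1", leaving the colon to be supplied by the processing script.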

Heads, bylines, and paragraph-number ranges (e.g., [para 1-10]) will appear in a hyperlinked Table of Contents. For new e-books in development, the publisher must make sure that all divisions have heads. In print version books, some sections do not have heads (e.g., first section at the beginning of a chapter, dedication page, etc.). Publishers should inform their vendors of the text to be inserted, for example, [Dedication], [Intro], or [No head in print version].

3.2.6 Pop-Up Divisions

It is possible to include a subset of <div>s that will be accessible only in pop-up windows, without appearing as part of the main text and hidden from view in the Table of Contents. In general, the type of content suitable for this treatment will adhere to the following parameters: a.) it is considered entirely supplementary/ancillary to the main text (i.e., skipping over it will not detract from the reading experience); b.) it can be easily referenced from various points throughout the main text. For example, if the author is quoting excerpts from letters or other historical documents, he or she may wish to include a reference to the full text of the letter in a pop-up <div> (e.g., “[Click to read full text]”). Another example might be including links to glossary-style definitions of terms used by the author, provided it is not considered essential for readers to be able to access the entire list of definitions as a separate section within the book.

A section of pop-up divisions should be the last section in the <back> matter (following 3.5.4. About the Author) and should be tagged as follows:

<div1 type="popuptarget" id="div1_pop" status="nodisplay">
<head><bibl type="title">Supplementary Text</bibl></head>
<div2 type="popuptarget" id="div2_pop.1" status="nodisplay">
<head><bibl type="title">Section Head</bibl></head>
<p>[text text text]</p>

Please note:
1. Paragraphs in pop-up <div>s should not be numbered, as they are not part of the main sequence of text paragraphs (also see 3.4.1. Paragraphs).
2. <div2> is the lowest-level structural division available for tagging pop-ups; unlike in the main text, a <div2> within a <div1> should always be used to tag an individual pop-up, even if it is the only subdivision within that <div1>.
3. Elements within pop-up sections (e.g., <table>, <list>) cannot be separately linked to; only the <div2> as a whole can be linked to (also see 3.4.3. Links).
4. Text within pop-up <div>s is currently not included in searches.

For further information on use of pop-up divisions, please contact HEB.

3.3 Front Matter

3.3.1 Series Title List

If your book is one in a series of titles, and the print version contains a list of already published titles in the series, please do NOT include this list in the front matter. (Instead, if desired, add information on the series to the Copyright and Permissions section—see below.)

3.3.2 Title Page

Begin tagging the <front> section with <titlepage> information. (The information contained within the <titlepage></titlepage> tags is metadata and will not appear as a separate section within the book. It is, however, drawn on in part for the title record page and digital “running heads”.) Do not tag the blank and half-title pages.

<titlepage>
<titlepart type="main">Title of work</titlepart>
<titlepart type="subtitle">Subtitle of work</titlepart>
<docauthor>Author's Name</docauthor>
<pubplace>Publisher's location</pubplace>
</titlepage>

Note that multiple author names should not appear in separate <docauthor> tags, but should be listed in the same tag, separated by commas.
This should be followed by a regular <div1> titlepage section (which may be derived from the print version) that will be visible to readers in the TOC. Tag as follows:

<div1 type="titlepage" id="div1_tpg">
<head><bibl type="title">Title Page</bibl></head>
<p><hi1 rend="bold">Title of work: Subtitle of work</hi1></p>
<milestone rend="skipline"/>
<p>Author's Name</p>
<milestone rend="skipline"/>
<p>Publisher's location</p>

3.3.4 Table of Contents

The Table of Contents does not need to be tagged because a TOC will be generated and linked dynamically from the heads within each division.

3.3.5 List of Illustrations

A List of Illustrations is NOT generated dynamically, so you must tag this in a <div1> section. Publishers can choose which text portions to include here (we recommend using the <bibl type="figcap"> portion of each <figure> only, omitting text tagged as <bibl type="figsrc">, and urge consideration of abbreviated descriptions for long figure captions). To create a basic List of Illustrations with links to each individual figure, tag as follows:

<div1 type="figures" id="div1_fig">
<head><bibl type="title">List of Illustrations</bibl></head>
<list>
<item>Figure 1.1. Caption for figure. [Figure <ptr type="txt" target="fg_heb90001.0001" n="1.1" />]</item>
<item>Figure 1.2. Caption for figure. [Figure <ptr type="txt" target="fg_heb90001.0002" n="1.2" />]</item>
</list>
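
For books with many figures, the list entries can be generated rather than typed by hand. The following sketch follows the entry pattern shown above; illustration_item is a hypothetical helper, and it assumes the caption passed in is already valid XML text (it is inserted as-is, so any <hi1> markup in it is preserved).

```python
def illustration_item(heb_id, figure_index, figure_number, caption):
    """Build one List of Illustrations entry linking to a figure id."""
    # The target id follows the fg_ + entity name pattern used for <figure> ids.
    target = f"fg_{heb_id}.{figure_index:04d}"
    return (
        f"<item>Figure {figure_number}. {caption} "
        f'[Figure <ptr type="txt" target="{target}" n="{figure_number}" />]</item>'
    )
```

Calling illustration_item("heb90001", 2, "1.2", "Caption for figure.") reproduces the second entry of the example above.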

This section should follow Copyright and Permissions by default, unless there is a pre-existing List of Illustrations which appears in a different order in the print version of the book, and the publisher wishes to retain this order. If the figures are not interspersed in the book text, but rather displayed as a separate section (plates), you do not need to create a separate List of Illustrations. You can simply create a <div1> with the head “Illustrations” that contains all the figures. (See 3.4.5. Figures for specific tagging information.)

3.3.6 List of Audio / Video Clips

Publishers may choose to provide a List of Audio or Video Clips for titles including such media. Formatting should match the List of Illustrations, but link to a clip’s container element (e.g., <p>) rather than the clip’s (normally id-less) <ref>. (For more on tagging audio and video, see 3.4.6. Audio and Video Files.) Example:

<div1 type="audio" id="div1_aud">
<head><bibl type="title">List of Music Clips</bibl></head>
<list>
<item>1. Caption for first clip. [Music Clip <ptr type="txt" target="p_102" n="1" />]</item>
<item>2. Caption for second clip. [Music Clip <ptr type="txt" target="p_155" n="2" />]</item>
</list>

3.3.7 Dedication, Acknowledgments, etc.

Tag other front-matter text in separate <div1> sections within the <front> section. For example:

<div1 type="dedication" id="div1_ded">

<div1 type="acknowledgments" id="div1_ack">

<div1 type="epigraph" id="div1_epi">

<div1 type="frontispiece" id="div1_fnt">

3.3.8 Preface to the Electronic Edition

Including a brief Preface to the Electronic Edition, containing such information as a description of elements specific to the e-book, links to external resources associated with the e-book, etc., may be helpful to readers. (Samples provided by HEB upon request.)

3.4. General Elements

3.4.1 Paragraphs

Use <p> to tag paragraphs. Paragraphs in the text should generally be assigned unique number and id values. In the e-book, the paragraph number (n value) will appear in the left margin next to the paragraph and will be used for identification and citation. Number each paragraph sequentially, beginning with the first paragraph of Acknowledgments (or the first significant front-matter text chunk) and continue numbering throughout the main text. You can also continue numbering through the back matter if it contains paragraphs in sections such as Appendices or About the Author.

Unnumbered <p> tags should be used in the following contexts:

1. Paragraphs appearing within notes (<note1> tags).
2. Paragraphs within extracts, such as epigraphs or quotations.
3. Paragraphs that are used just for styling, such as the <p> tags used on the title and copyright pages, those used for certain short lines of text (e.g., text lines that function as heads or footers, such as “<p>Source: …</p>”), and those used to set apart <pb> tags (see below).
4. Paragraphs within pop-up divisions.

Please check with HEB staff if any questions arise regarding paragraph-numbering.

<p n="40" id="p_40"></p>
<p n="41" id="p_41"></p>
<p n="42" id="p_42"></p>
<p>[text text text]</p>

Numbered paragraphs will be rendered with justified margins and a text block width of 530 pixels.
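
Because paragraph numbers are used for citation, it can be worth checking the sequence before submission. The sketch below is one possible check, not an HEB tool: check_paragraph_numbers is a hypothetical name, and the pattern assumes the n attribute precedes id, as in the examples above. Unnumbered <p> tags are ignored.

```python
import re

# Matches numbered paragraphs written as <p n="40" id="p_40">.
P_TAG = re.compile(r'<p\s+n="(\d+)"\s+id="p_(\d+)"')


def check_paragraph_numbers(xml_text):
    """Report numbered paragraphs whose n and id disagree or break the sequence."""
    problems = []
    expected = None
    for match in P_TAG.finditer(xml_text):
        n, id_num = int(match.group(1)), int(match.group(2))
        if n != id_num:
            problems.append(f"n={n} does not match id=p_{id_num}")
        if expected is not None and n != expected:
            problems.append(f"expected n={expected}, found n={n}")
        # Continue counting from the value actually found.
        expected = n + 1
    return problems
```

An empty list means every numbered paragraph follows on from the previous one and carries a matching id.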

3.4.2 Page Breaks

For titles that are also published in print, tag print-version page breaks with <pb> tags. Tagging page breaks gives readers a way to find citations based on the print-version pages. It is normally preferable to leave out <pb> tags for blank print pages, unless that page happens to be referenced elsewhere in the book.

Note that page breaks must be placed within the main text, not in or above head tags. Generally, a <pb> tag should be placed as if to open a new text portion rather than close the preceding one (e.g., place the tag at the beginning of the first <p> of the new page rather than at the end of the last <p> of the previous page). In some cases, proper line flow may be disrupted by the insertion of page numbers, and placement of <pb> tags occurring within or preceding certain elements (such as epigraphs or tables) might have to be adjusted, e.g., on/in a separate verse line, table row, or unnumbered paragraph (<p><pb n="10" id="pb_10"/></p>). Please confer with HEB staff if any questions arise about <pb> placement.

<div1 type="chapter" id="div1_c01" status="hidden">
<head><bibl type="title">Chapter Title</bibl></head>
<div2 type="section" id="div2_c01.1">
<head><bibl type="title">Section Title</bibl></head>
<p n="25" id="p_25"><pb n="16" id="pb_16"/>Paragraph text.</p>

Page breaks within text should follow this format:

Beginning of sentence[space]<pb n="121" id="pb_121"/>rest of sentence.

3.4.4 Extracts: Quotations, Epigraphs

Use <q1> tags to tag extracts formatted as block quotations within a paragraph. Paragraphs within <q1> tags should NOT be numbered.

<p n="20" id="p_20">Paragraph text.
<q1>
<p>Text of extract.</p>
</q1>
Continuation of paragraph text.</p>

If the quote is formatted in lines (e.g., verse), use <l> or <lg> with <l>:

<l>First line.</l>
<l>Second line.</l>
<l>Third line.</l>

To tag epigraphs, place <epigraph> around the <q1> tag and add <bibl type=”epi”> to tag epigraph author and source. Epigraphs should follow opening tags of whichever divisional level text has been chunked by; e.g., if text is delivered by section, an epigraph opening a chapter should appear within the first <div2> (section-level) tags rather than <div1> (chapter-level) tags.

<div1 type="chapter" id="div1_c01" status="hidden">
<head><bibl type="number">Chapter 1</bibl></head>
<div2 type="section" id="div2_c01.1">
<head><bibl type="para">1-10</bibl></head>
<epigraph>
<q1>
<p>Text of epigraph.</p>
</q1>
<bibl type="epi">&#8212;Quotation author, <hi1 rend="italic">Title of quotation source</hi1></bibl>
</epigraph>

3.4.5 Figures

Image files should be named in accordance with the title’s HEB number, followed by the figure number. (Note that the letters “heb” must appear in lower case in figure entity names in order to be processed by our system. Other letters appearing in entities—e.g., heb90001.001a—will NOT be processed by our system and should be avoided.) Keep in mind that figure entity names don’t necessarily correspond with figure numbers as they appear in the text (e.g., Figure 2.1, the first figure appearing in Chapter 2 and 12th figure total, might be heb90001.0012 or heb90001.0201).

Please provide all of the following image files:

1. High-Resolution Tiff files
High-resolution TIFF files must be submitted for all images. Publishers can submit the high-resolution files used for print production. If high-resolution TIFF files are not readily available or are difficult to create, please discuss the situation with a technical contact at HEB. These files will be used to create files for the image-viewer tool, and will be archived in case JPEG files need to be reproduced.
Format/Resolution: TIFF, 300–600 dpi (depending on image type); image quality: high
Size: Variable
File name: heb90001.0001.tif

2. Small JPEG Files
Small JPEG files appear within the text, either embedded in paragraph text or as a series of images in a separate section.
Format/Resolution: JPEG, 72 dpi; image quality: medium or high
Size: 300–450 pixels on the long side, depending on orientation/shape, text flow, clarity, and file size. Please check with HEB staff if questions arise.
File name: heb90001.0001.jpg

3. Large JPEG Files
Large JPEG images appear in a pop-up window when users click on the small JPEGs within a book. (Enlarged jpegs are not needed if image-viewer technology will be used to view large versions of images.)
Format/Resolution: JPEG, 72 dpi; image quality: high
Size: 600–750 pixels on the long side, depending on what is needed for clarity. In cases where images include a lot of detail (e.g., maps), larger images might be required.
File name: heb90001.0001-lg.jpg
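
The three file names for a figure all derive from the same entity name, so they can be produced together. The sketch below is illustrative only: the function names are hypothetical, the four-digit figure index mirrors the examples above, and the validation pattern (lower-case "heb", five digits, a period, then digits only) is an assumption based on those examples rather than a rule stated by HEB.

```python
import re

# Assumed shape of a figure entity name, e.g. heb90001.0001.
ENTITY_NAME = re.compile(r"heb\d{5}\.\d+")


def figure_file_names(heb_id, figure_index):
    """Return the entity name and the three image file names for one figure."""
    entity = f"{heb_id}.{figure_index:04d}"
    return {
        "entity": entity,
        "tiff": entity + ".tif",
        "small_jpeg": entity + ".jpg",
        "large_jpeg": entity + "-lg.jpg",
    }


def is_valid_entity_name(name):
    """True when the name has no upper-case letters or trailing letters."""
    return ENTITY_NAME.fullmatch(name) is not None
```

Names such as heb90001.001a or HEB90001.0001 fail the check, which matches the warning above about letters in entity names.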

NOTE: Before submitting, make sure all images are a.) rotated by 90 degrees to correct orientation if necessary, and b.) properly cropped—i.e., remove extra white space around figure, frames, etc.
Tip: A quick way to convert image files in one batch is using Adobe Photoshop’s Web Photo Gallery feature. First rename your image files using the figure entity name, e.g., heb90001.0001 (or heb90001.0001-lg for large JPEGs). From the File menu, select Automate, Web Photo Gallery. Select the location of the original files and the destination of the converted files. Select Large Images from the Options pull-down, click on Re-size Images, select Custom in the pull-down, and enter 300 (or 600) pixels; then select Constrain: Both, and Quality: Medium (or High). A website with converted images will be created. You can then submit these converted images.

Small jpegs will appear embedded within the text wherever a <figure> tag is inserted. Users will be able to click on these to open a pop-up window showing a larger version of the image. For each title, publishers can select several ways for users to enlarge an image:

1. Simple pop-up: Pop-up window brings up a larger version of the image (large jpeg). Recommended for most images.

2. Image viewer: Pop-up window that shows an image viewer that allows users to zoom in and pan on images (tiff or suitably large jpeg, without the -lg extension; if the latter, ensure these are submitted in a separate folder from the jpegs used for simple pop-ups). Recommended for titles with high-resolution art images or detailed line drawings or maps.

3. External image: Pop-up window opens external URL showing enlarged image. (Used for images housed within other online collections.)

Note: In certain cases, image enlargement may not be desired (e.g., when using a simple graph or icon for illustration purposes only). This can be accommodated by uploading small JPEGs only upon release to ensure that the enlargement option will not be activated in the online title. Check with HEB for further info.

Tagging Figures in XML

Place figure entity declarations at the top of the XML file:

<!ENTITY heb90001.0001 SYSTEM "heb90001.0001.jpg" NDATA jpeg>

Tag figures within the text as described below. All figure tags must be placed within <p> tags. Break down caption information by figure number, caption, and source/permissions; in some cases, source information will be incorporated into the main caption text, and it may be desirable to omit <bibl type="figsrc"> altogether.

<p>[text text text] <figure entity="heb90001.0001" id="fg_heb90001.0001">
<head>
<bibl type="figno">Figure 1.</bibl>
<bibl type="figcap">Caption for figure.</bibl>
<bibl type="figsrc">Source info/permission for figure.</bibl>
</head>
</figure>[text text text] </p>

Note: Please make sure each <figure> includes a <head> tag. This is necessary for further processing at HEB, but the tag may be left blank if inclusion of a caption is not desired.

In addition to the above, for images to be viewed with the image-viewer tool, the attribute type="ic" should be added to the <figure> tag (<figure entity="heb90001.0001" id="fg_heb90001.0001" type="ic">). Publishers are required to submit certain data pertaining to “ic” images, to be displayed in the image-viewer interface, in a separate spreadsheet (this information is not tagged in the XML file itself, but housed in a separate database). For external images, the attribute type="ext" should be added (<figure entity="heb90001.0001" id="fg_heb90001.0001" type="ext">). All external images will also require delivery to HEB of target URLs in a separate spreadsheet. Download a basic template of the spreadsheet required for “ic” and “ext” with two samples here. Please contact HEB for more information.

(A third type value that may be added to the <figure> tag is type="imagemap" [<figure entity="heb90001.0001" id="fg_heb90001.0001" type="imagemap">]. This applies only to figures classified as interactive images; an additional requirement for these is supplying HEB with coordinates for points in the image from which interactive links will originate. Please be sure to contact HEB for details before applying this attribute to any figures.)

If figures appear in a separate section (plates), then just tag as a <div1> section and tag each figure in a <p> tag. It is recommended to move such sections to the front matter (where normally the List of Illustrations would appear) for organizational purposes.

Note: jpg extensions should NOT be included for figure entity names listed within <figure> tags.

3.4.6 Audio and Video Files

Audio and video clips in several standard formats may be included (e.g., .mp3 and .mov). HEB recommends file size on individual clips be kept to 20 MB and under to minimize download times (longer clips may need to be broken down into several subcomponents prior to submission). Dimensions for video should ideally be 320 X 240 pixels to match the default pop-up window in which clips will be displayed, but can be larger if required. Clips formatted to the above specifications should be named in accordance with the title’s unique HEB number.

Tag clips using <ref> tags, as follows:

Listen to music clip 1, <ref type="audio" filename="heb90001.0001.mp3">"Title of Clip"</ref>.

View video clip 1, <ref type="video" filename="">"Title of Clip"</ref>.

In order to reference clips within a text, link to the container element (e.g., <p> or <table type="insert">).

(Also see video clip <ptr type="txt" target="p_158" n="2"/>.)

The same principle applies to creating a List of Audio or Video Clips (see 3.3.6. List of Audio / Video Clips).

(Note: Flash animation may also be included, using a type="flash" attribute and listing specific parameters concerning dimensions, etc., as values under rend, for example: rend="quality:high;width:320;height:240". Please confer with HEB regarding specifics if the inclusion of Flash is desired.)

3.4.7 PDFs

Publishers may wish to include additional text or image resources in the form of PDF documents. Like figures and other media, these files must be named in accordance with the title’s HEB ID number. Here is an example of PDF tagging:

Click to view <ref type="pdf" filename="heb90001.0001.pdf">sample pages</ref> of the original manuscript.

3.4.8 Tables and Inserts

Use <table> tags for actual tables, or for text formatted as a table. (Adding heads is optional; the id attribute is necessary if a table will be subsequently referenced/linked.) The border attribute may be used for formatting if desired, as well as colspan and rowspan attributes for cells. Do not repeat column or row header cells if table spans several pages (as they often are in the print version in such cases).

<table id="tb_1" border="1">
<head><bibl type="title">Table Head</bibl></head> (head optional)

The attribute type="insert" may be used with the <table> element to section off a specific text portion (e.g., letter, historical document, or text box) from the main text as a text block with a border around it. The insert text should be placed within a single <cell> with attribute type="letter". Often, using extracts or regular paragraphs may be preferable to using this type of formatting.

<table type="insert" id="in_1">
<head> (head optional)
<bibl type="number">Insert Number</bibl>
<bibl type="title">Insert Head</bibl>
</head>
<row><cell type="letter">
<p>[text text text]</p>
<p>[text text text]</p>
</cell></row>
</table>

Note: Please make sure the insert id value includes the “in_” prefix (rather than “tb_” for standard tables). This is necessary for further processing at HEB.

3.4.9 Lists

Use <list> tag for lists. (Heads are optional; the id attribute is required if a list will be subsequently referenced/linked.) You can also nest lists within list items. Items in a list will be formatted with a hanging indent.

<list id="ls_1"> (id optional)
<head><bibl type="title">List Head</bibl></head> (head optional)

3.4.10 Salute, Signed, Dateline

Text constituting a salutation or signature (e.g., a greeting prefixed to a letter or the closing salutation appended to a foreword) should be tagged using the elements <salute> and <signed>. The element <dateline> should be used to tag the date and/or location prefixed or appended to a letter, transcript, or other document, as well as any text appearing outside this context that serves a similar purpose (e.g., a location/date functioning as a heading of sorts but which is NOT tagged as an actual <head>). The attributes align="center" or align="right" may be added to all three elements for formatting purposes (left alignment is default).

<dateline>Location, Date</dateline>
<salute>Dedicated to …</salute>
<p n="7" id="p_7">Text of dedication.</p>
<signed>Author's name</signed>
<signed>Institution name</signed>

3.5 Back Matter

3.5.1 Notes

In the <back> section, create a <div1> with type="notes". Place notes for each chapter in separate <div2> sections (each of these <div2>s must also include type value “notes”). Each note id should follow this format: “nt_[xxx].n[notenumber]” where [xxx] is “int” or “c01”, “c02”, etc.

<div1 type="notes" id="div1_nts">
<div2 type="notes" id="">
<note1 n="1" id="nt_int.n1">
<p>Text of note.</p>
. . .
<note1 n="10" id="nt_int.n10">
<p>Text of note.</p>
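
Note ids built from the format above are easy to generate and check in bulk. This is a sketch under stated assumptions: note_id and note_id_matches are hypothetical names, and the pattern accepts only “int” and two-digit chapter codes such as “c01”, although the “etc.” in the rule above suggests other section codes may also be valid.

```python
import re

# Assumed id shape: nt_ + section code + .n + note number, e.g. nt_int.n1.
NOTE_ID = re.compile(r"nt_(int|c\d{2})\.n(\d+)")


def note_id(section, number):
    """Build a note id from a section code and a note number."""
    return f"nt_{section}.n{number}"


def note_id_matches(n_value, id_value):
    """True when the id fits the pattern and ends in the same number as n."""
    match = NOTE_ID.fullmatch(id_value)
    return match is not None and match.group(2) == str(n_value)
```

For example, note_id("int", 10) yields the id used for the tenth note of the introduction in the example above.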

Handling “Ibid.” in Notes

Since end notes will appear as pop-up windows, for notes that include the word “Ibid.” users will not see the referenced note in the pop-up window. We suggest you replace the word “Ibid.” with the referenced text, commenting out “Ibid.” and commenting where inserted text begins and ends. Note that often it’s not such a clear-cut copy and paste replacement, because the referenced note can include lengthy text or multiple books. The question the publisher will need to work out is which portion of the previous note should replace the “Ibid.” (If this is too difficult, it may be preferable to leave “Ibid.” in place.)

<note1 n="10" id="nt_c01.n10"> <p>Jones and Smith, <hi1 rend="italic">History of the United States</hi1>, Chapters 1 and 2.</p></note1>
<note1 n="11" id="nt_c01.n11"> <p><!--Ibid.--><!--begin insert-->Jones and Smith, <hi1 rend="italic">History of the United States</hi1>, Chapters 1 and 2. <!--end insert--></p></note1>
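
Once the replacement text has been chosen, the substitution itself can be scripted. This is a minimal Python sketch, not an HEB-provided tool; the function name expand_ibid is hypothetical, and deciding which portion of the previous note to insert remains an editorial decision.

```python
def expand_ibid(note_text, referenced_text):
    """Replace the first "Ibid." in note_text with referenced_text.

    The original "Ibid." is kept as an XML comment, and the inserted text is
    bracketed by begin/end comments, following the sample tagging above.
    """
    replacement = (
        "<!--Ibid.--><!--begin insert-->"
        + referenced_text
        + "<!--end insert-->"
    )
    return note_text.replace("Ibid.", replacement, 1)
```

Any text that follows “Ibid.” in the note (such as a page number) is left in place after the inserted text.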

For new online titles, publishers may want to consider ending the usage of terms such as “Ibid.” and “Op. cit.” in their house style, so that notes can be more efficiently processed in the electronic version.

3.5.2 Bibliography

Within the <back> section, create a <div1> section with type="bibliography". If there are multiple sections in the bibliography, create subsections using <div2>, etc. All <bibl> tags (starting with the Bibliography, not those appearing in epigraphs) should be sequentially id’d. The “bib_” prefix must appear in id values to ensure proper display and functioning. Replace any 3-em dashes used in print to stand for repeated authors’ names with the actual names.

<div1 type="bibliography" id="div1_bib">
<head><bibl type="title">Bibliography</bibl></head>
<bibl id="bib_1"> … </bibl>
<bibl id="bib_2"> … </bibl>
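
The sequential numbering of bibliography entries can be verified with a short script. The sketch below uses Python's standard xml.etree.ElementTree module. It assumes that numbering starts at bib_1 and runs continuously across any <div2> subsections, and that <bibl> tags inside a <head> carry no id; the function name bibl_id_problems is hypothetical, and the input must be a complete, well-formed <div1> element.

```python
import xml.etree.ElementTree as ET


def bibl_id_problems(div1_xml):
    """List (expected_number, actual_id) for entries that break the sequence.

    <bibl> elements inside <head> are heading components, not entries,
    so they are skipped.
    """
    root = ET.fromstring(div1_xml)
    in_heads = {b for head in root.iter("head") for b in head.iter("bibl")}
    entries = [b for b in root.iter("bibl") if b not in in_heads]
    problems = []
    for expected, bibl in enumerate(entries, start=1):
        actual = bibl.get("id")
        if actual != f"bib_{expected}":
            problems.append((expected, actual))
    return problems
```

An empty list means every entry carries the expected “bib_” id in document order.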

3.5.3 Index

To tag an index, create a <div1> section with type="index". Create a main <list>, then put the entries for each letter into a separate nested list (placed within one of the main list’s <item> tags). Subentries for a term should be placed within yet another <list> inside the term’s <item> tag.

To link a page number to a specific page in the text, use the empty <ptr> tag, and add “txt” in a type attribute, the page number in an n attribute, and the page break id in a target attribute. (If an index is being created for a born-digital book, paragraph links can be used instead.) Our system will take the n attribute value and hyperlink the number to the targeted page break in the text. For page ranges (e.g., “30-35”), only the first page should be tagged.

Note references in Index: For note references, the page number rather than the note number should be tagged. (There may be some exceptions to this rule, such as e-books derived from print titles using footnotes rather than endnotes, in which case page numbers for notes are rendered obsolete, as these should have been moved to a separate backmatter section in the HEB edition. In this event, please confer with HEB staff prior to tagging.)

Figure references in Index: Instead of <hi1 rend="italic">, use [fig.] after a page link to designate a figure reference. Alternatively, such page links may be converted to direct figure links (see 3.3.5. List of Illustrations for tagging instructions).

<div1 type="index" id="div1_ind">
<head><bibl type="title">Index</bibl></head>
<item>Apples, <ptr type="txt" target="pb_1" n="1"/>, <ptr type="txt" target="pb_15" n="15"/>-17</item>
<item>Apricots, <ptr type="txt" target="pb_133" n="133"/></item>
<item>Bananas, <ptr type="txt" target="pb_34" n="34"/>n18</item>
<item>Blueberries, <ptr type="txt" target="pb_68" n="68"/> [fig.], <ptr type="txt" target="pb_69" n="69"/></item>
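
The markup for an individual page reference can be generated from the printed index entry. The following Python sketch is illustrative only; the function name page_ref is hypothetical, and it assumes Arabic page numbers and page-break ids of the form “pb_” plus the page number, as in the sample above. Anything after the first page number (the end of a range, a note reference, or a [fig.] marker) is passed through as plain text.

```python
import re


def page_ref(pages):
    """Build index markup for a page reference such as "15", "15-17", or "34n18".

    Only the leading page number is wrapped in <ptr>; the remainder of the
    string is appended unchanged.
    """
    match = re.match(r"^(\d+)(.*)$", pages)
    if match is None:
        raise ValueError(f"no leading page number in {pages!r}")
    first, rest = match.groups()
    return f'<ptr type="txt" target="pb_{first}" n="{first}"/>{rest}'
```

Roman-numeral page references (for example, from front matter) are rejected by this sketch and would need separate handling.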

NOTE: Indices containing a large number of <ptr> links may lead to very long load times. It is therefore advisable to break overlong indices down into several <div2> subsections (one option is to create a new subsection roughly every 1000 <ptr>s, by letter: e.g., the result might be a 3-section index with the sub-headings “A-G”, “H-O”, “P-Z”).
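
One way to choose the section boundaries is to count the <ptr> tags under each letter and close a section once the running total reaches the limit. The Python sketch below illustrates that option only; the function name split_index and the default limit of 1000 are assumptions taken from the note above, not a fixed HEB rule.

```python
def split_index(ptr_counts, limit=1000):
    """Group letters into sections, closing a section once `limit` <ptr>s is reached.

    ptr_counts: list of (letter, number_of_ptrs) pairs in alphabetical order.
    Returns sub-headings such as "A-G"; a letter is never split across sections.
    """
    sections, current, total = [], [], 0
    for letter, count in ptr_counts:
        current.append(letter)
        total += count
        if total >= limit:
            sections.append(current)
            current, total = [], 0
    if current:
        sections.append(current)
    return [s[0] if len(s) == 1 else f"{s[0]}-{s[-1]}" for s in sections]
```

Because letters are kept whole, a section can exceed the limit when a single letter has many entries; the resulting sub-headings can be adjusted by hand.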

3.5.4 About the Author

As the final <back> matter section, create a <div1> section with information about the author(s). (An author’s photo may be included, if desired.)

<div1 type="aboutauthor" id="div1_aut">
<bibl type="title">About the Author</bibl>
<bibl type="para">999</bibl>
<p><figure entity="heb90001.0100" id="fg_heb90001.0100"><head>
<bibl type="figcap">Author’s name.</bibl>
<bibl type="figsrc">Author Photo: Photographer’s name.</bibl>
</head></figure></p>
<p>[text text text]</p>

4. Proof and Quality Control

XML files must be quality checked and proofread before submission to ACLS Humanities E-Book. We have provided a proofing XSLT style sheet to help view XML in a format closer to the final online version.

5. List of Elements

The following list shows all the elements defined for the ACLS Humanities E-Book acls-hebook.dtd. This list is an edited subset of the elements in the TEI Lite XML DTD.

contains any appendices, notes, bibliography, index, etc., following the main part of a text.

contains either a.) a sub-component of a heading, b.) the name(s) of the author(s) of an epigraph, or c.) a bibliographic citation.

contains the whole body of a single unitary text, excluding any front or back matter.

contains one cell of a table.

contains a brief description of the place, date, and/or time of production of a document (usually prefixed or suffixed to it as a kind of heading or trailer).

<div1> … <div4>
contains a first, second, third, fourth level subdivision of the front, body, or back of a text.

contains the name of the author of the document, as given on the title page.

contains the imprint statement (<publisher> name, place of publication <pubplace>), as given (usually) at the foot of a title page.

contains the title of a document, including all its constituents, as given on a title page. Must be divided into <titlepart> elements.

contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page.

marks the spot at which a graphic is to be inserted in a document. Attributes may be used to indicate an entity containing the image itself (in some non-XML notation); <head> within the <figure> element may be used to transcribe captions.

used to designate any prefatory matter before the start of the text proper.

contains any heading, for example the title of a section, or the heading of a list or glossary (sub-components should be tagged individually with <bibl> tags).

used to tag text as italicized (<hi1 rend="italic">), bold (<hi1 rend="bold">), struck out (<hi1 rend="strike">), underlined (<hi1 rend="und">), superscript (<hi1 rend="sup">), subscripted (<hi1 rend="sub">), or one of several combinations thereof (<hi1 rend="bolditalic">, <hi1 rend="supund">, <hi1 rend="supbold">, <hi1 rend="boldund">, <hi1 rend="italicsunderlined">).

contains one component of a list.

contains a single, possibly incomplete, line of verse.

contains a group of verse lines functioning as a formal unit, e.g., a stanza, refrain, verse paragraph, etc.

contains any sequence of items organized as a list, whether of numbered, bulleted, or other type.

marks a boundary between two sections of a text that does NOT constitute the start of a new division-based section. Attributes are skipline and asterisk.

contains a note or annotation.

marks paragraphs in prose.

marks the boundary between one page of a text and the next in a standard reference system.

a pointer (reference) to another location in the current document in terms of one or more identifiable elements.

provides the name of the organization responsible for the publication or distribution of a bibliographic item.

contains the name of the place where a bibliographic item was published.

contains a quotation or apparent quotation.

a reference to another location in the current document, in terms of one or more identifiable elements, possibly modified by additional text or comment.

contains one row of a table.

contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.

contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.

contains text displayed in tabular form, in rows and columns, or as a block sectioned off from the main text by a border (inserts).

the entire body of the book.

contains the title page of a text, appearing within the front matter.

contains a subsection or division of the title of a work, as indicated on a title page; also used for free-floating fragments of the title page not part of the document title, authorship attribution, etc.

Attribute Chart

For processing purposes, it is useful to make the order in which attributes appear for any given element consistent throughout the text.

Note: For id attribute values, prefixes with underscores (e.g., p_, pb_) must be used, as listed below; our system will recognize only id’s with these prefixes.

[Attribute chart omitted; only these annotations survive: div head; figure head; epigraph source; div1_cpy, div1_pre...; div2_int.1, div2_nts.c01...; note reference; to web page.]
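
A quick prefix check can catch ids that the system would not recognize. The Python sketch below is an assumption-laden illustration: its prefix list is compiled only from the samples and update log in this document (p_, pb_, nt_, bib_, div1_, div2_, fg_, in_) and may be incomplete, and the function name has_known_prefix is hypothetical.

```python
# Prefixes seen in this document's samples and update log. The full attribute
# chart may define others, so treat this tuple as incomplete.
KNOWN_PREFIXES = ("p_", "pb_", "nt_", "bib_", "div1_", "div2_", "fg_", "in_")


def has_known_prefix(id_value):
    """Return True if id_value starts with a known prefix followed by more text."""
    return any(
        id_value.startswith(prefix) and len(id_value) > len(prefix)
        for prefix in KNOWN_PREFIXES
    )
```

An id that fails this check is not necessarily wrong, since the list is partial, but it is worth comparing against the current DTD and template.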

6. Cover Image

You must submit a high-resolution TIFF file and two JPEG files for the cover image, which will appear on the Title Record Page of each book.

1. High-resolution TIFF image: size: variable; format: TIFF, 300 dpi; image quality: high
File name: heb90001.0001.001.tif

2. Small JPEG image: width: 160 pixels; format: JPEG, 72 dpi; image quality: medium or high
File name: heb90001.0001.001.jpg

3. Large JPEG image for pop-up window: height: 650 pixels; format: JPEG, 72 dpi; image quality: high
File name: heb90001.0001.001-lg.jpg

7. Technical Contact at HEB

Nina Gielen

8. HEB Specifications-Log of Specifications Updates

03-10-15 Update:

1. Updated DTD (corrected “rowspan” attribute from being associated with element <row> to element <cell>).

09-29-10 Updates:

New rules and additions:
1. Updated DTD and XSLT (see 1. Files and Resources).
2. Pop-Up Divisions: Supplementary content can now be included in the form of pop-ups hidden from view in the TOC and main text; sample tagging provided.
3. Paragraphs: Unnumbered paragraphs are to be used for new pop-up <div>s.
4. Links: Added instructions for linking to new pop-up <div>s.
5. Figures: Publishers can submit small JPEGs only if image enlargement is not desired.

1. Updated use of ACLS to HEB throughout.
2. Heads: Expanded instructions for <bibl type="byline">.
3. Copyright and Permissions: Updated sample tagging.
4. Links: Added note of tagging various media using <ref>.
5. Notes: Updated generic note-tagging schema.
6. Index: Corrected description of hyperlinked page-number rendering.
7. Attribute chart: Added “nodisplay” to <div1> and <div2>; added pop-up option to <ref>.

10-19-09 Update:

1. Updated XSLT.

08-18-09 Updates:

New rules and additions:
1. Updated DTD, XSLT, and template (see 1. Files and Resources).
2. Encoding: All characters above US-ASCII range must now be encoded using numeric entities only.
3. Figures: Both image viewer (type="ic") and external (type="ext") images now require submission of spreadsheet containing additional data; sample spreadsheet provided.
4. PDFs: Submission of additional resources in the form of PDF documents now permitted; sample tagging provided.

1. Subsections under 3.4. General Elements affected by new additions have been renumbered.
2. Removed references to and instructions for use of named character entities throughout (numeric character entities now required for all titles).
3. General Note on XML Formatting: Added note on avoiding use of certain characters for styling only.
4. Figures: Added note on using jpegs for image-viewer instead of tiffs.
5. Index: Expanded instructions on tagging note references.
6. Attribute chart: Added pdf option to <ref>.
7. Log of Specifications Updates: Removed link to page tracking previous versions of DTD (page no longer being updated).

03-09-09 Updates:

New rules and additions:
1. Updated DTD and XSLT (see 1. Files and Resources).
2. Audio and Video Files: Flash animation can now be included.
3. Title Page: Multiple authors must be listed in the same <docauthor> tag.
4. Figures: type="ic" attribute figures may require submission of related info to HEB in a spreadsheet (depending on quantity of figures).
5. Tables and Inserts: Insert <table>s must use “in_” in id values.
6. Notes: All <div2> subsections in this section must include type="notes".
7. Bibliography: <bibl>s must use “bib_” in id values.

1. Files and Resources: Updated instructions for using XSLT to transform XML and review output HTML file.
2. All subsections under 3. Tagging Text have been renumbered.
3. General Note on XML Formatting: Avoid pretty-print formatting.
4. Title Page: Use of metadata explicated.
5. Former List of Audio / Film Clips renamed List of Audio / Video Clips.
6. List of Illustrations and List of Audio / Video Clips: added missing attributes to <div1>s in sample tagging.
7. Audio and Video Files: Guidelines on file types and sizes relaxed.

02-20-08 Updates:

New rules and additions:
1. Tagging Text: Added general note on XML file formatting.
2. Figures: All <figure>s now require <head>s.
3. Figures: Removed url attribute for type="ext" figures and added information on alternate delivery of associated urls.

1. Audio / Video Files: .mp3 and .mov now considered preferred formats.
2. DTD, template, and XSLT updated; also added links for non-zipped versions.

04-30-07 Updates:

New rules and additions:
1. Styling: Added new option italic-underlined.
2. Figures: Type attribute added for interactive image-map images.

1. Heads: Added note on redundant punctuation in <bibl type="number"> tag.
2. Audio / Video Files: Added .mov to list of video formats; softened dimension-requirements.
3. DTD, template, and XSLT updated.
4. Attribute Chart updated.

03-12-07 Update:

1. Copy throughout updated from History E-Book to Humanities E-Book.

09-05-06 Updates:

New rules and additions:
1. Updated DTD and template (see 1. Files and Resources).
2. Styling: Added new option subscript.
3. Structure: Added “List of Audio / Video Clips” to front matter.
4. Structure: Added “Preface to the Electronic Edition” to front matter.
5. Links: Added information on tagging multiple links for same term.
6. Figures: Use of letters in entity names now expressly prohibited (except “heb”).
7. Figures: Updated size requirements for small JPEGs.
8. Figures: Type attribute added for image-viewer images.
9. Figures: Added suggestion for placement of plate-section.
10. Audio / Video Files: New section created with instructions for submitting and tagging this media.

1. Renumbered sections affected by new additions.
2. Removed reference to obsolete link listing entity names.
3. Text Chunks: Necessity for delivery by lowest-available <div> emphasized.
4. Milestone Section Breaks: Corrected error in tagging sample. (Also renamed section, for clarity.)
5. Copyright and Permissions: Clarified language used in sample for information on print version.
6. List of Illustrations: Corrected error in tagging sample.
7. Paragraphs: Deleted obsolete mention of rend="no-indent".
8. Page Breaks: Guidelines for placement of <pb>s expanded.
9. About the Author: Corrected error in tagging sample.
10. List of Elements updated.
11. Attribute Chart updated.

12-01-04 Updates:

1. Updated DTD, template, XSLT.
2. Elements: <table> attribute type="insert" (in conjunction with <cell type="letter">) replaces separate element <insert>.
3. Styling/<hi1> tag: hyphens deleted from rend="…" values.
4. Technical contact updated.

09-22-04 Updates:

New rules and additions:
1. Updated DTD and template (see 1. Files and Resources).
2. Added XSLT (see 1. Files and Resources).
3. Elements: Added new elements <dateline>, <insert>.
4. Structure: “Titlepage” <div1> division added to front matter (following <titlepage>); back matter <div1> “About the Author” now required.
5. Special Characters: &nbsp; added to special characters.
6. <hi1> tag: Six new styling options added.
7. Page breaks: Rules for blank pages added.
8. Figures: Cropping of images prior to submitting now required.
9. Figures: Type attribute added for external images.
10. Tables: <border>, <rowspan>, and <colspan> attributes added.
11. Salute, signed, dateline: <align> attribute added.
12. Index: Breakdown into sections now required for overlong index.

1. Numbered all sections.
2. Linked cross-references.
3. Guidelines on punctuation with <hi1> tags added.
4. Guidelines on use of status="hidden" added.
5. Suggested id’s for lists, tables, inserts refined.
6. Suggestions for use of figure captions in List of Illustrations refined.
7. Guidelines for use of unnumbered <p>s expanded.
8. Guidelines for placement of <pb>s refined.
9. Guidelines for internal linking expanded; corrected header for Links: Notes, Internal Links, URLs.
10. Use of <figsrc> clarified.
11. Linking to notes from Index addressed.
12. List of Elements updated.
13. Attribute Chart updated.
14. Technical contact e-mails updated.

09-30-03 Updates:

1. Cover specs: Submission of cover image as TIFF file (in addition to JPEGs) now required.

09-18-03 Updates:

New rules and additions:
1. ID attribute: New tags used for id (and target) throughout; now require underscore as part of prefix (e.g., p_10).
2. <hi1> tag: New styling option added: <hi1 rend="bolditalic">.
3. Heads: Byline attribute added to <bibl>-options.
4. Copyright and Permissions: Instructions have been expanded.
5. List of Illustrations: This now uses direct links to figures (as opposed to print-page number on which figure appears); location of section specified.
6. Paragraphs: “no-indent” attribute for paragraphs now obsolete.
7. Links: Now possible to link to divs, paragraphs, figures, tables, lists, bibl items.
8. Figures: Updates to figure specs (file names, size). Figures now require ids.
9. Index: Request for individually nested sublists, direct figure-link info added. Note-reference tagging in index changed from note number to page number.

1. <text>-tag (in front matter): Instructions for attributes have been refined.
2. Page breaks: Flexibility for location of <pb> for stylistic reasons in specific cases (e.g., extract lines).
3. Extracts: Location of epigraphs has been specified.
4. Salute, Signed: Subsection added to General Elements.
5. List of Elements updated.
6. Attribute Chart updated.

05-13-03 Updates:

1. Epigraph source: The <bibl> tag in an epigraph source should include the attribute type="epi" (<bibl type="epi">).
2. All images must be submitted with high-resolution TIFFs. Small edits to image specs for clarity.
3. Index tagging: Removed specs to tag links to note id in index.

02-20-03 Updates:

1. Figures: Clarified image size specs. Small images, maximum size 530.
2. Small text copyedits.