<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html > <head> <title>A brief introduction to TeX4ht</title> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> <meta name="generator" content="TeX4ht (http://www.cse.ohio-state.edu/~gurari/TeX4ht/)"> <meta name="originator" content="TeX4ht (http://www.cse.ohio-state.edu/~gurari/TeX4ht/)"> <!-- html --> <meta name="src" content="tex4ht_doc.tex"> <meta name="date" content="2008-02-12 21:25:00"> <link rel="stylesheet" type="text/css" href="tex4ht_doc.css"> </head><body > <div class="maketitle"> <h2 class="titleHead"> A BRIEF INTRODUCTION TO TEX4HT </h2><div class="authors"><span class="author" > <span class="cmr-8">KAPIL HARI PARANJAPE</span> </span></div> <div class="submaketitle"> </div> </div> <h3 class="sectionHead"><span class="titlemark">1. </span> <a id="x1-10001"></a>What do we have here?</h3> <!--l. 31--><p class="noindent" >What follows is a brief introduction to the TeX4ht system designed and currently maintained by Eitan M. Gurari. The source for this document is in the file <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht_doc.tex</span></span></span> and can be processed using the command <span class="obeylines-h"><span class="verb"><span class="cmtt-10">htlatex</span><span class="cmtt-10"> tex4ht_doc.tex</span></span></span> as explained below. It is hoped that such processing will prove instructive as well. <h3 class="sectionHead"><span class="titlemark">2. </span> <a id="x1-20002"></a>Executive summary</h3> <!--l. 39--><p class="noindent" >TeX4ht is a system to convert TeX input into hypertext documents of different kinds. TeX4ht operates on input that is “standard” <span class="TEX">T<span class="E">E</span>X</span> or <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span class="E">E</span>X</span></span> (but please check the last section for some differences). 
This input is processed by <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex</span></span></span> in the usual way except that certain additional macros are loaded which create some hooks in the output that can be used to produce the hypertext. The output is then post-processed by the program <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> which produces the hypertext. Auxiliary files such as <span class="obeylines-h"><span class="verb"><span class="cmtt-10">.css</span></span></span> files and image files are produced by the program <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span>. <!--l. 49--><p class="indent" > Usage is simplified via the Perl script <span class="obeylines-h"><span class="verb"><span class="cmtt-10">mk4ht</span></span></span> which can be called directly to combine the above operations transparently. For example, the source of this document can be processed using <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 54--><p class="nopar" > This will produce <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht_doc.html</span></span></span> and some supplementary files, which together form the HTML version of this documentation. Similarly, <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht xhmlatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 59--><p class="nopar" > will produce the XML version with MathML and <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht mzlatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 63--><p class="nopar" > will produce MathML which uses fonts that are rendered well via the “Gecko” engine of <span class="obeylines-h"><span class="verb"><span class="cmtt-10">mozilla</span></span></span>. 
Additional such commands are <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht oolatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 68--><p class="nopar" > for a format that can be read by <span class="obeylines-h"><span class="verb"><span class="cmtt-10">OpenOffice</span></span></span>, <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht dblatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 72--><p class="nopar" > for DocBook and <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht teilatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 76--><p class="nopar" > for TEI format XML output. The broad structure of the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">mk4ht</span></span></span> command-line is <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht #1 #2 #3 #4 #5</div> </td></tr></table> <!--l. 81--><p class="nopar" > The first argument is the type of conversion required. Using <span class="obeylines-h"><span class="verb"><span class="cmtt-10">mk4ht</span></span></span> without arguments lists the conversions available. The second argument is the name of the file that is to be processed. The third, fourth and fifth arguments are optional and are described in some detail below. <!--l. 88--><p class="indent" > The rest of this document introduces the system in a little more detail. See <span class="cite">[<a href="#Xauthdoc">1</a>]</span> and <span class="cite">[<a href="#Xwebsite">2</a>]</span> for authoritative information. 
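<p class="indent" > As a rough illustration of this structure, and reusing only option values that appear as examples in the sections below, a command line that supplies all five arguments might look like <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex "xhtml,mozilla" "-cmozhtf" "-p" </div> </td></tr></table> <p class="nopar" > Here <span class="obeylines-h"><span class="verb"><span class="cmtt-10">htlatex</span></span></span> is the conversion (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#1</span></span></span>), <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht_doc.tex</span></span></span> is the input file (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#2</span></span></span>), <span class="obeylines-h"><span class="verb"><span class="cmtt-10">"xhtml,mozilla"</span></span></span> is the list of style options (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#3</span></span></span>), <span class="obeylines-h"><span class="verb"><span class="cmtt-10">"-cmozhtf"</span></span></span> is handed to <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#4</span></span></span>) and <span class="obeylines-h"><span class="verb"><span class="cmtt-10">"-p"</span></span></span> is handed to <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span> (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#5</span></span></span>). This particular combination is only a sketch assembled for illustration; it is not taken from the main documentation. <p class="indent" >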
In the next section (Section <a href="#x1-30003">3<!--tex4ht:ref: style --></a>) we examine the options for modifying the way in which <span class="TEX">T<span class="E">E</span>X</span> processes the source; specifically, these can be thought of as options for the macros in <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht.sty</span></span></span>. The following section (Section <a href="#x1-40004">4<!--tex4ht:ref: postproc --></a>) deals with the post-processing that converts <span class="TEX">T<span class="E">E</span>X</span>’s output into hypertext. Section <a href="#x1-50005">5<!--tex4ht:ref: supple --></a> then shows how one can change the way the system generates the supplementary files like images and style-sheets for the hypertext output, and the last section (Section <a href="#x1-60006">6</a>) notes some differences between TeX4ht and <span class="TEX">T<span class="E">E</span>X</span>. <!--l. 99--><p class="indent" > This document assumes that the reader has some familiarity with the <span class="TEX">T<span class="E">E</span>X</span> and <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span class="E">E</span>X</span></span> systems; see <span class="cite">[<a href="#Xtex">3</a>]</span> and <span class="cite">[<a href="#Xlatex">4</a>]</span> for more information. <h3 class="sectionHead"><span class="titlemark">3. </span> <a id="x1-30003"></a>Options for Styles</h3> <!--l. 104--><p class="noindent" >Options for <span class="TEX">T<span class="E">E</span>X</span> and <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span class="E">E</span>X</span></span> processing can be added as the first optional argument (<span class="obeylines-h"><span class="verb"><span class="cmtt-10">#3</span></span></span> above) to the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">mk4ht</span></span></span> command. For example, the command <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht xhmlatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 
109--><p class="nopar" > is in fact similar<span class="footnote-mark"><a href="tex4ht_doc2.html#fn1x0">1</a></span><a id="x1-3001f1"></a> to the command <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex "xhtml,mathml" </div> </td></tr></table> <!--l. 114--><p class="nopar" > Similarly, <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht oolatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 118--><p class="nopar" > is in fact similar to the command <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex "xhtml,ooffice" </div> </td></tr></table> <!--l. 122--><p class="nopar" > In most cases this list of options begins with <span class="obeylines-h"><span class="verb"><span class="cmtt-10">html</span></span></span> or <span class="obeylines-h"><span class="verb"><span class="cmtt-10">xhtml</span></span></span>. Additional options available can be found by searching for the string <span class="obeylines-h"><span class="verb"><span class="cmtt-10">---</span><span class="cmtt-10"> Note</span><span class="cmtt-10"> ---</span></span></span> at the start of a line in the resulting log file. For example <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex  <br />grep -A 1 '^--- Note ---' tex4ht_doc.log </div> </td></tr></table> <!--l. 130--><p class="nopar" > will list all the available options for <span class="obeylines-h"><span class="verb"><span class="cmtt-10">html</span></span></span> conversion. <!--l. 
133--><p class="indent" > When this list of options does not start with <span class="obeylines-h"><span class="verb"><span class="cmtt-10">html</span></span></span> or <span class="obeylines-h"><span class="verb"><span class="cmtt-10">xhtml</span></span></span>, the system looks for a file with the name given by the first option and the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">.cfg</span></span></span> extension. The simplest use of this feature is as follows. Create a file called <span class="obeylines-h"><span class="verb"><span class="cmtt-10">bgimage.cfg</span></span></span> containing the lines <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> \Preamble{html}  <br />\begin{document}  <br />\Css{BODY { background-image : url(background.png); }}  <br />\EndPreamble </div> </td></tr></table> <!--l. 143--><p class="nopar" > After this, <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex "bgimage" </div> </td></tr></table> <!--l. 147--><p class="nopar" > will add a line to <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht_doc.css</span></span></span> incorporating the image <span class="obeylines-h"><span class="verb"><span class="cmtt-10">background.png</span></span></span>. See the main documentation <span class="cite">[<a href="#Xauthdoc">1</a>]</span> for more details on creating configuration files. <h3 class="sectionHead"><span class="titlemark">4. </span> <a id="x1-40004"></a>Post processing</h3> <!--l. 
153--><p class="noindent" >The optional arguments <span class="obeylines-h"><span class="verb"><span class="cmtt-10">#4</span></span></span> and <span class="obeylines-h"><span class="verb"><span class="cmtt-10">#5</span></span></span> refer to options for the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> and <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span> commands respectively. Both these commands make use of the configuration file <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht.env</span></span></span> (which may be overridden by <span class="obeylines-h"><span class="verb"><span class="cmtt-10">.tex4ht</span></span></span> in the current directory or the user’s home directory). This configuration file is called the “environment file” in the main documentation <span class="cite">[<a href="#Xauthdoc">1</a>]</span> in order to avoid confusing it with the configuration file described in the previous section. <!--l. 162--><p class="indent" > The program <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> has to look for “font descriptions” that describe how various non-standard glyphs are to be “rendered” in hypertext. The TeX4ht system provides a number of possibilities, such as using Unicode or fonts suited to the Gecko engine of the Mozilla browser. So the command <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht mzlatex tex4ht_doc.tex </div> </td></tr></table> <!--l. 169--><p class="nopar" > is almost<span class="footnote-mark"><a href="tex4ht_doc3.html#fn2x0">2</a></span><a id="x1-4001f2"></a> equivalent to <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> mk4ht htlatex tex4ht_doc.tex "xhtml,mozilla" "-cmozhtf" </div> </td></tr></table> <!--l. 
174--><p class="nopar" > The <span class="obeylines-h"><span class="verb"><span class="cmtt-10">-c&lt;tagname&gt;</span></span></span> option for <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> picks up the tagged section from the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht.env</span></span></span> environment file. Any other command-line option of <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> can also be used as part of <span class="obeylines-h"><span class="verb"><span class="cmtt-10">#4</span></span></span> which is just a space-separated list of options for this command. <h3 class="sectionHead"><span class="titlemark">5. </span> <a id="x1-50005"></a>Creating Supplementary Files</h3> <!--l. 182--><p class="noindent" >The final step of conversion is the creation of supplementary files like image files for formulae and equations like <center class="math-display" > <img src="tex4ht_doc0x.png" alt="\frac{x^n-1}{x-1} = \sum_{i=0}^{n-1} x^i" class="math-display" ></center> <!--l. 184--><p class="nopar" > which is the rendering of the <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span class="E">E</span>X</span></span> input string <table class="verbatim"><tr class="verbatim"><td class="verbatim"><div class="verbatim"> \[ \frac{x^n-1}{x-1} = \sum_{i=0}^{n-1} x^i \] </div> </td></tr></table> <!--l. 188--><p class="nopar" > In most cases such <span class="TEX">T<span class="E">E</span>X</span> constructions can only be rendered as images. The <span class="obeylines-h"><span class="verb"><span class="cmtt-10">tex4ht</span></span></span> program creates a series of instructions for the <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span> program in a <span class="obeylines-h"><span class="verb"><span class="cmtt-10">.lg</span></span></span> file. 
The latter carries out these instructions by making use of external programs like <span class="obeylines-h"><span class="verb"><span class="cmtt-10">dvipng</span></span></span> or <span class="obeylines-h"><span class="verb"><span class="cmtt-10">convert</span></span></span> to create these images. The most useful option in the argument list <span class="obeylines-h"><span class="verb"><span class="cmtt-10">#5</span></span></span> is <span class="obeylines-h"><span class="verb"><span class="cmtt-10">-p</span></span></span> which prevents images from being generated. Another useful option is <span class="obeylines-h"><span class="verb"><span class="cmtt-10">-cvalidate</span></span></span> which causes the net output to be validated using an external validation program such as <span class="obeylines-h"><span class="verb"><span class="cmtt-10">xmllint</span></span></span>. All the options in the argument list <span class="obeylines-h"><span class="verb"><span class="cmtt-10">#5</span></span></span> are passed on to <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span>. <h3 class="sectionHead"><span class="titlemark">6. </span> <a id="x1-60006"></a>Some differences between TeX4ht and TeX</h3> <!--l. 201--><p class="noindent" >We document some differences between the systems. For more up-to-date information please see the author’s documentation <span class="cite">[<a href="#Xauthdoc">1</a>]</span>. <!--l. 204--><p class="noindent" ><span class="subsectionHead"><span class="titlemark">6.1. </span> <a id="x1-70006.1"></a><span class="cmbx-10">Regarding filenames.</span></span> In short, do <span class="cmti-10">not </span>use special characters in your filenames; ideally stick with filenames which are composed of standard ASCII alphanumerics wherever possible. Some explanations follow. <!--l. 
209--><p class="indent" > <span class="TEX">T<span class="E">E</span>X</span> nowadays accepts files with names that contain all manner of characters and so it is natural to imagine that TeX4ht will do so too. However, one has to be concerned with the filenames used in output as well as those used for input. Since the output filenames are derived from the input filenames and appear in URLs within the hypertext, special characters in them will cause hyperlinks to break. Thus TeX4ht does not currently behave well if special characters are used in input file names. <!--l. 217--><p class="noindent" ><span class="subsectionHead"><span class="titlemark">6.2. </span> <a id="x1-80006.2"></a><span class="cmbx-10">Extra braces required.</span></span> In short, when in doubt, enclose subscripts and superscripts in braces if they are longer than a single character. <!--l. 221--><p class="indent" > In this respect the syntax of the TeX language that is accepted by TeX4ht is stricter than that accepted by <span class="TEX">T<span class="E">E</span>X</span> and <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span class="E">E</span>X</span></span>. <h3 class="sectionHead"><a id="x1-90006.2"></a>References</h3> <!--l. 224--><p class="noindent" > <div class="thebibliography"> <p class="bibitem" ><span class="biblabel"> <span class="cmr-8">[1]</span> <span class="bibsp"><span class="cmr-8"> </span><span class="cmr-8"> </span><span class="cmr-8"> </span></span></span><a id="Xauthdoc"></a> <a href="http://www.cse.ohio-state.edu/~gurari/mn.html" class="url" ><span class="cmtt-8">http://www.cse.ohio-state.edu/</span><span class="cmtt-8">~</span><span class="cmtt-8">gurari/mn.html</span></a> <span class="cmr-8">The authoritative documentation</span> <span class="cmr-8">maintained by Eitan M. 
Gurari.</span> </p> <p class="bibitem" ><span class="biblabel"> <span class="cmr-8">[2]</span> <span class="bibsp"><span class="cmr-8"> </span><span class="cmr-8"> </span><span class="cmr-8"> </span></span></span><a id="Xwebsite"></a> <a href="http://www.cse.ohio-state.edu/~gurari" class="url" ><span class="cmtt-8">http://www.cse.ohio-state.edu/</span><span class="cmtt-8">~</span><span class="cmtt-8">gurari</span></a> <span class="cmr-8">Eitan M.</span><span class="cmr-8"> Gurari’s web page that discusses</span> <span class="cmr-8">related projects.</span> </p> <p class="bibitem" ><span class="biblabel"> <span class="cmr-8">[3]</span> <span class="bibsp"><span class="cmr-8"> </span><span class="cmr-8"> </span><span class="cmr-8"> </span></span></span><a id="Xtex"></a> <a href="http://www.tug.org/" class="url" ><span class="cmtt-8">http://www.tug.org/</span></a> <span class="cmr-8">The </span><span class="TEX"><span class="cmr-8">T</span><span class="E"><span class="cmr-8">E</span></span><span class="cmr-8">X</span></span><span class="cmr-8"> Users Group primary web site.</span> </p> <p class="bibitem" ><span class="biblabel"> <span class="cmr-8">[4]</span> <span class="bibsp"><span class="cmr-8"> </span><span class="cmr-8"> </span><span class="cmr-8"> </span></span></span><a id="Xlatex"></a> <a href="http://www.latex-project.org/" class="url" ><span class="cmtt-8">http://www.latex-project.org/</span></a> <span class="cmr-8">The </span><span class="LATEX"><span class="cmr-8">L</span><span class="A"><span class="cmr-8">A</span></span><span class="TEX"><span class="cmr-8">T</span><span class="E"><span class="cmr-8">E</span></span><span class="cmr-8">X</span></span></span><span class="cmr-8"> project’s primary web site.</span></p></div> </body></html>