<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"  
  "http://www.w3.org/TR/html4/loose.dtd">  
<html > 
<head>  <title>A brief introduction to TeX4ht</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> 
<meta name="generator" content="TeX4ht (http://www.cse.ohio-state.edu/~gurari/TeX4ht/)"> 
<meta name="originator" content="TeX4ht (http://www.cse.ohio-state.edu/~gurari/TeX4ht/)"> 
<!-- html --> 
<meta name="src" content="tex4ht_doc.tex"> 
<meta name="date" content="2008-02-12 21:25:00"> 
<link rel="stylesheet" type="text/css" href="tex4ht_doc.css"> 
</head><body 
>
  <div class="maketitle">
  <h2 class="titleHead">
A BRIEF INTRODUCTION TO TEX4HT
  </h2><div class="authors"><span class="author" >
<span 
class="cmr-8">KAPIL HARI PARANJAPE</span>
  </span></div>
<div class="submaketitle">
</div>
  </div>
  <h3 class="sectionHead"><span class="titlemark">1. </span> <a 
 id="x1-10001"></a>What do we have here?</h3>
<!--l. 31--><p class="noindent" >What follows is a brief introduction to the TeX4ht system designed and currently
maintained by Eitan M. Gurari. The source for this document is in the file
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht_doc.tex</span></span></span> and can be processed using the command <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">htlatex</span><span 
class="cmtt-10">&#x00A0;tex4ht_doc.tex</span></span></span>
as explained below. It is hoped that such processing will prove instructive as
well.
  <h3 class="sectionHead"><span class="titlemark">2. </span> <a 
 id="x1-20002"></a>Executive summary</h3>
<!--l. 39--><p class="noindent" >TeX4ht is a system to convert TeX input into hypertext documents of different kinds.
TeX4ht operates on input that is &#8220;standard&#8221; <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;or <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span> (but please check
the last section for some differences). This input is processed by <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex</span></span></span> in the
usual way except that certain additional macros are loaded which create some
hooks in the output that can be used to produce the hypertext. The output is
then post-processed by the program <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span> which produces the hypertext.
Auxiliary files such as <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">.css</span></span></span> files and image files are produced by the program
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">t4ht</span></span></span>.
<!--l. 49--><p class="indent" >  Usage is simplified via the Perl script <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mk4ht</span></span></span> which can be called directly to combine
the above operations transparently. For example the source of this document can be
processed using
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 54--><p class="nopar" > This will produce <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht_doc.html</span></span></span>, the HTML version of this documentation, together with
some supplementary files. Similarly,
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;xhmlatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 59--><p class="nopar" > will produce the XML version with MathML and
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;mzlatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 63--><p class="nopar" > will produce MathML which uses fonts that are rendered well via the &#8220;Gecko&#8221; engine
of <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mozilla</span></span></span>. Additional such commands are
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;oolatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 68--><p class="nopar" > to produce a format that can be read by <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">OpenOffice</span></span></span> and
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;dblatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 72--><p class="nopar" > for DocBook and
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;teilatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 76--><p class="nopar" > for TEI format XML output. The broad structure of the <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mk4ht</span></span></span> command-line
is
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;#1&#x00A0;#2&#x00A0;#3&#x00A0;#4&#x00A0;#5</div>
</td></tr></table>
<!--l. 81--><p class="nopar" > The first argument is the type of conversion required. Using <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mk4ht</span></span></span> without arguments
lists the conversions available. The second argument is the name of the file that is to be
processed. The third, fourth and fifth arguments are optional and are described in some
detail below.
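<p class="indent" >  As a sketch of how the five arguments fit together, the following call combines option values that appear in the sections below. The particular combination is chosen only for illustration and is not a recommended invocation.
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"xhtml,mozilla"&#x00A0;"-cmozhtf"&#x00A0;"-p"
</div>
</td></tr></table>
<p class="nopar" > Here <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">htlatex</span></span></span> is the type of conversion (<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#1</span></span></span>), <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht_doc.tex</span></span></span> is the file to be processed (<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#2</span></span></span>), and the three quoted strings are the optional arguments <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#3</span></span></span>, <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#4</span></span></span> and <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#5</span></span></span>.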
<!--l. 88--><p class="indent" >  The rest of this document introduces the system in a little more detail. See&#x00A0;<span class="cite">[<a 
href="#Xauthdoc">1</a>]</span> and <span class="cite">[<a 
href="#Xwebsite">2</a>]</span>
for authoritative information. Section&#x00A0;<a 
href="#x1-30003">3<!--tex4ht:ref: style --></a> examines the
options for modifying the way in which <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;processes the source; specifically these
can be thought of as options for the macros in <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht.sty</span></span></span>. Section&#x00A0;<a 
href="#x1-40004">4<!--tex4ht:ref: postproc --></a> deals with the post-processing that converts <span class="TEX">T<span 
class="E">E</span>X</span>&#8217;s output into hypertext.
Section&#x00A0;<a 
href="#x1-50005">5<!--tex4ht:ref: supple --></a> shows how one can change the way the system
generates the supplementary files like images and style-sheets for the hypertext
output. Section&#x00A0;<a 
href="#x1-60006">6</a> lists some differences between TeX4ht and <span class="TEX">T<span 
class="E">E</span>X</span>.
<!--l. 99--><p class="indent" >  This document assumes that the reader has some familiarity with the <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;and
<span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span>&#x00A0;systems; see <span class="cite">[<a 
href="#Xtex">3</a>]</span> and <span class="cite">[<a 
href="#Xlatex">4</a>]</span> for more information.
  <h3 class="sectionHead"><span class="titlemark">3. </span> <a 
 id="x1-30003"></a>Options for Styles</h3>
<!--l. 104--><p class="noindent" >Options for <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;and <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span>&#x00A0;processing can be added as the first optional argument
(<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#3</span></span></span> above) to the <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mk4ht</span></span></span> command. For example, the command
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;xhmlatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 109--><p class="nopar" > is in fact similar<span class="footnote-mark"><a 
href="tex4ht_doc2.html#fn1x0">1</a></span><a 
 id="x1-3001f1"></a> 
to the command
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"xhtml,mathml"
</div>
</td></tr></table>
<!--l. 114--><p class="nopar" > Similarly,
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;oolatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 118--><p class="nopar" > is in fact similar to the command
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"xhtml,ooffice"
</div>
</td></tr></table>
<!--l. 122--><p class="nopar" > In most cases this list of options begins with <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">html</span></span></span> or <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">xhtml</span></span></span>. Additional options available
can be found by searching for the string <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">---</span><span 
class="cmtt-10">&#x00A0;Note</span><span 
class="cmtt-10">&#x00A0;---</span></span></span> at the start of a line in the
resulting log file. For example
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex
&#x00A0;<br />grep&#x00A0;-A&#x00A0;1&#x00A0;'^---&#x00A0;Note&#x00A0;---'&#x00A0;tex4ht_doc.log
</div>
</td></tr></table>
<!--l. 130--><p class="nopar" > will list all the available options for <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">html</span></span></span> conversion.
<!--l. 133--><p class="indent" >  When this list of options does not start with <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">html</span></span></span> or <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">xhtml</span></span></span> then the system looks for
a file with the name given by the first option and the <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">.cfg</span></span></span> extension. The simplest use
of this feature is as follows. Create a file called <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">bgimage.cfg</span></span></span> containing the
lines
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
\Preamble{html}
&#x00A0;<br />\begin{document}
&#x00A0;<br />\Css{BODY&#x00A0;{&#x00A0;background-image&#x00A0;:&#x00A0;url(background.png);&#x00A0;}}
&#x00A0;<br />\EndPreamble
</div>
</td></tr></table>
<!--l. 143--><p class="nopar" > After this
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"bgimage"
</div>
</td></tr></table>
<!--l. 147--><p class="nopar" > will add an additional line to <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht_doc.css</span></span></span> incorporating the image
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">background.png</span></span></span>. See the main documentation <span class="cite">[<a 
href="#Xauthdoc">1</a>]</span> for more details on creating
configuration files.
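<p class="indent" >  As a further sketch, a configuration file may carry more than one style rule. The file name <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">mystyle.cfg</span></span></span> and the rules below are hypothetical; the class names are those that appear in the HTML generated for this document. The characters <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#</span></span></span> and <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">%</span></span></span> are avoided in the rules as a precaution, since <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;gives them a special meaning.
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
\Preamble{xhtml}
&#x00A0;<br />\begin{document}
&#x00A0;<br />\Css{h3.sectionHead&#x00A0;{&#x00A0;color&#x00A0;:&#x00A0;navy;&#x00A0;}}
&#x00A0;<br />\Css{div.verbatim&#x00A0;{&#x00A0;background-color&#x00A0;:&#x00A0;silver;&#x00A0;}}
&#x00A0;<br />\EndPreamble
</div>
</td></tr></table>
<p class="nopar" > Such a file would be selected in the same way as before, by naming it as the first option:
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"mystyle"
</div>
</td></tr></table>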
  <h3 class="sectionHead"><span class="titlemark">4. </span> <a 
 id="x1-40004"></a>Post processing</h3>
<!--l. 153--><p class="noindent" >The optional arguments <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#4</span></span></span> and <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#5</span></span></span> refer to options for the <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span> and <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">t4ht</span></span></span> commands
respectively. Both these commands make use of the configuration file <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht.env</span></span></span> (which
may be overridden by <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">.tex4ht</span></span></span> in the current directory or the user&#8217;s home directory).
This configuration file is called the &#8220;environment file&#8221; in the main documentation <span class="cite">[<a 
href="#Xauthdoc">1</a>]</span> in
order to avoid confusing it with the configuration file described in the previous
section.
<!--l. 162--><p class="indent" >  The program <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span> has to look for &#8220;font descriptions&#8221; that describe how various
non-standard glyphs are to be &#8220;rendered&#8221; in hypertext. The TeX4ht system provides a
number of possibilities like using Unicode or fonts suited to the Gecko engine of the
Mozilla browser and so on. So the command
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;mzlatex&#x00A0;tex4ht_doc.tex
</div>
</td></tr></table>
<!--l. 169--><p class="nopar" > is almost<span class="footnote-mark"><a 
href="tex4ht_doc3.html#fn2x0">2</a></span><a 
 id="x1-4001f2"></a> 
equivalent to
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
mk4ht&#x00A0;htlatex&#x00A0;tex4ht_doc.tex&#x00A0;"xhtml,mozilla"&#x00A0;"-cmozhtf"
</div>
</td></tr></table>
<!--l. 174--><p class="nopar" > The <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">-c&#x003C;tagname&#x003E;</span></span></span> option for <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span> picks up the tagged section from the
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht.env</span></span></span> environment file. Any other command-line option of <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span> can also
be used as part of <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#4</span></span></span> which is just a space-separated list of options for this
command.
  <h3 class="sectionHead"><span class="titlemark">5. </span> <a 
 id="x1-50005"></a>Creating Supplementary Files</h3>
<!--l. 182--><p class="noindent" >The final step of conversion is the creation of supplementary files like image files for
formulae and equations like
  <center class="math-display" >
<img 
src="tex4ht_doc0x.png" alt="\frac{x^n-1}{x-1} = \sum_{i=0}^{n-1} x^i" class="math-display" ></center>
<!--l. 184--><p class="nopar" > which is the rendering of the <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span>&#x00A0;input string
                                                                     

                                                                     
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
\[&#x00A0;\frac{x^n-1}{x-1}&#x00A0;=&#x00A0;\sum_{i=0}^{n-1}&#x00A0;x^i&#x00A0;\]
</div>
</td></tr></table>
<!--l. 188--><p class="nopar" > In most cases such <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;constructions can only be rendered as images. The <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht</span></span></span>
program creates a series of instructions for the <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">t4ht</span></span></span> program in a <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">.lg</span></span></span> file. The <span class="obeylines-h"><span class="verb"><span class="cmtt-10">t4ht</span></span></span> program
carries out these instructions by making use of external programs like <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">dvipng</span></span></span> or
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">convert</span></span></span> to create these images. The most useful option in the argument list <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#5</span></span></span>
is <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">-p</span></span></span> which prevents images from being generated. Another useful option is
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">-cvalidate</span></span></span> which causes the net output to be validated using an external validation
program such as <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">xmllint</span></span></span>. All the options in the argument list <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">#5</span></span></span> are passed on to
<span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">t4ht</span></span></span>.
  <h3 class="sectionHead"><span class="titlemark">6. </span> <a 
 id="x1-60006"></a>Some differences between TeX4ht and TeX</h3>
<!--l. 201--><p class="noindent" >We document some differences between the systems. For more up-to-date information
please see the author&#8217;s documentation&#x00A0;<span class="cite">[<a 
href="#Xauthdoc">1</a>]</span>.
<!--l. 204--><p class="noindent" ><span class="subsectionHead"><span class="titlemark">6.1. </span> <a 
 id="x1-70006.1"></a><span 
class="cmbx-10">Regarding filenames.</span></span>
  In short, do <span 
class="cmti-10">not </span>use special characters in your filenames; ideally stick with filenames
which are composed of standard ASCII alphanumerics wherever possible. Some
explanations follow.
<!--l. 209--><p class="indent" >  <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;nowadays accepts files with names that contain all manner of characters and so
it is natural to imagine that TeX4ht will do so too. However, one has to be concerned with
the filenames used in output as well as those used for input. Since the output filenames are
derived from the input filename and appear in URLs within the hypertext, using special characters
will cause hyperlinks to break. Thus TeX4ht does not currently behave well if special characters are
used in input file names.
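<p class="indent" >  One way to follow this advice, sketched here with hypothetical file names, is to copy a source file to a plainly named one before processing it. The underscore is assumed to be safe because the source of this document, <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">tex4ht_doc.tex</span></span></span>, uses one.
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
cp&#x00A0;"My&#x00A0;Thesis&#x00A0;(draft).tex"&#x00A0;my_thesis_draft.tex
&#x00A0;<br />mk4ht&#x00A0;htlatex&#x00A0;my_thesis_draft.tex
</div>
</td></tr></table>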
<!--l. 217--><p class="noindent" ><span class="subsectionHead"><span class="titlemark">6.2. </span> <a 
 id="x1-80006.2"></a><span 
class="cmbx-10">Extra braces required.</span></span>
  In short, when in doubt, enclose subscripts and superscripts in braces if they are longer
than a single character.
<!--l. 221--><p class="indent" >  In this respect the syntax of the TeX language that is accepted by TeX4ht is stricter
than that accepted by <span class="TEX">T<span 
class="E">E</span>X</span>&#x00A0;and <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span>.
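<!--l. 223--><p class="indent" >  As an illustration, the displayed formula in the section on supplementary files already follows this rule by writing <span class="obeylines-h"><span class="verb"><span 
class="cmtt-10">\sum_{i=0}^{n-1}</span></span></span>. The lines below sketch the same habit for two other cases. The unbraced forms in the comments are usually accepted by <span class="LATEX">L<span class="A">A</span><span class="TEX">T<span 
class="E">E</span>X</span></span>; that they are the kind of input this subsection has in mind is an assumption, not something taken from the author&#8217;s documentation.
  <table 
class="verbatim"><tr class="verbatim"><td 
class="verbatim"><div class="verbatim">
$a_{\mathrm{max}}$&#x00A0;&#x00A0;%&#x00A0;rather&#x00A0;than&#x00A0;$a_\mathrm{max}$
&#x00A0;<br />$x^{\frac{1}{2}}$&#x00A0;&#x00A0;&#x00A0;%&#x00A0;rather&#x00A0;than&#x00A0;$x^\frac{1}{2}$
</div>
</td></tr></table>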
  <h3 class="sectionHead"><a 
 id="x1-90006.2"></a>References</h3>
<!--l. 224--><p class="noindent" >
   <div class="thebibliography">
   <p class="bibitem" ><span class="biblabel">
<span 
class="cmr-8">[1]</span>  <span class="bibsp"><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span></span></span><a 
 id="Xauthdoc"></a>   <a 
href="http://www.cse.ohio-state.edu/~gurari/mn.html" class="url" ><span 
class="cmtt-8">http://www.cse.ohio-state.edu/</span><span 
class="cmtt-8">~</span><span 
class="cmtt-8">gurari/mn.html</span></a> <span 
class="cmr-8">The authoritative documentation</span>
   <span 
class="cmr-8">maintained by Eitan M. Gurari.</span>
                                                                     

                                                                     
   </p>
   <p class="bibitem" ><span class="biblabel">
<span 
class="cmr-8">[2]</span>  <span class="bibsp"><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span></span></span><a 
 id="Xwebsite"></a>  <a 
href="http://www.cse.ohio-state.edu/~gurari" class="url" ><span 
class="cmtt-8">http://www.cse.ohio-state.edu/</span><span 
class="cmtt-8">~</span><span 
class="cmtt-8">gurari</span></a> <span 
class="cmr-8">Eitan M.</span><span 
class="cmr-8">&#x00A0;Gurari&#8217;s web page that discusses</span>
   <span 
class="cmr-8">related projects.</span>
   </p>
   <p class="bibitem" ><span class="biblabel">
<span 
class="cmr-8">[3]</span>  <span class="bibsp"><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span></span></span><a 
 id="Xtex"></a>  <a 
href="http://www.tug.org/" class="url" ><span 
class="cmtt-8">http://www.tug.org/</span></a> <span 
class="cmr-8">The </span><span class="TEX"><span 
class="cmr-8">T</span><span 
class="E"><span 
class="cmr-8">E</span></span><span 
class="cmr-8">X</span></span><span 
class="cmr-8">&#x00A0;Users Group primary web site.</span>
   </p>
   <p class="bibitem" ><span class="biblabel">
<span 
class="cmr-8">[4]</span>  <span class="bibsp"><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span><span 
class="cmr-8">&#x00A0;</span></span></span><a 
 id="Xlatex"></a>  <a 
href="http://www.latex-project.org/" class="url" ><span 
class="cmtt-8">http://www.latex-project.org/</span></a> <span 
class="cmr-8">The </span><span class="LATEX"><span 
class="cmr-8">L</span><span class="A"><span 
class="cmr-8">A</span></span><span class="TEX"><span 
class="cmr-8">T</span><span 
class="E"><span 
class="cmr-8">E</span></span><span 
class="cmr-8">X</span></span></span><span 
class="cmr-8">&#x00A0;project&#8217;s primary web site.</span></p></div>
   
</body></html>