%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% tex4ht_doc.tex                       2008-02-12-09:30  %
% Copyright (C) 2005, 2008      Kapil H. Paranjape       %
%                                                        %
% This work may be distributed and/or modified under the %
% conditions of the General Public License, either       %
% version 2 of this license or (at your option) any      %
% later version. The latest version of this license is   %
% in                                                     %
%   http://www.gnu.org/gpl.txt                           %
% and version 2 or later is part of all distributions    %
% of Debian.                                             %
%                                                        %
% This Current Maintainer of this work                   %
% is Kapil H. Paranjape.                                 %
%                                                        %
% If you modify this work your changing its signature    %
% with a directive of the following form will be         %
% appreciated.                                           %
%                                      kapil@imsc.res.in %
%                         http://www.imsc.res.in/~kapil  %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass{amsart}
\usepackage{hyperref}
\begin{document}
\title{A brief introduction to TeX4ht}
\author{Kapil Hari Paranjape}
\maketitle

\section{What do we have here?}
What follows is a brief introduction to the TeX4ht system
designed and currently maintained by Eitan M. Gurari. The source
for this document is in the file \verb|tex4ht_doc.tex| and can
be processed using the command \verb|htlatex tex4ht_doc.tex| as
explained below. It is hoped that such processing will prove
instructive as well.

\section{Executive summary}
TeX4ht is a system to convert \TeX\ input into hypertext documents of
different kinds. TeX4ht operates on input that is ``standard'' \TeX\ or
\LaTeX\ (but please check the last section for some differences).
This input is processed by \verb|tex| in the usual way except
that certain additional macros are loaded which create some hooks in
the output that can be used to produce the hypertext. The output is
then post-processed by the program \verb|tex4ht| which produces the
hypertext. Auxiliary files such as \verb|.css| files and image files
are produced by the program \verb|t4ht|.
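
As a rough sketch of these three stages run by hand (not the exact
commands issued by the scripts), assume a hypothetical file
\verb|manual.tex| whose preamble loads the macros explicitly with
\verb|\usepackage[html]{tex4ht}|. The conversion would then proceed
along the following lines; \verb|latex| may need to be run more than
once to resolve cross-references.
\begin{verbatim}
	latex manual.tex
	tex4ht manual
	t4ht manual
\end{verbatim}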

Usage is simplified via the Perl script \verb|mk4ht| which can be called
directly to combine the above operations transparently. For example the
source of this document can be processed using
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex
\end{verbatim}
This will produce \verb|tex4ht_doc.html| and some supplementary
files, which together form the HTML version of this documentation. Similarly,
\begin{verbatim}
	mk4ht xhmlatex tex4ht_doc.tex
\end{verbatim}
will produce the XML version with MathML and
\begin{verbatim}
	mk4ht mzlatex tex4ht_doc.tex
\end{verbatim}
will produce MathML that uses fonts rendered well by the
``Gecko'' engine of \verb|mozilla|. Additional such commands are
\begin{verbatim}
	mk4ht oolatex tex4ht_doc.tex
\end{verbatim}
for a format that can be read by \verb|OpenOffice| and
\begin{verbatim}
	mk4ht dblatex tex4ht_doc.tex
\end{verbatim}
for DocBook and
\begin{verbatim}
	mk4ht teilatex tex4ht_doc.tex
\end{verbatim}
for TEI format XML output.
The broad structure of the \verb|mk4ht| command-line is 
\begin{verbatim}
	mk4ht #1 #2 #3 #4 #5
\end{verbatim}
The first argument is the type of conversion required. Using
\verb|mk4ht| without arguments lists the conversions available. The
second argument is the name of the file that is to be processed. The
third, fourth and fifth arguments are optional and are described
in some detail below.
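
For instance, a call that supplies all five arguments might look as
follows. This only combines options that are discussed in the sections
below and is meant as an illustration, not as a recommended setting.
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "xhtml,mathml" "-cmozhtf" "-p"
\end{verbatim}
Here \verb|"xhtml,mathml"| is \verb|#3|, \verb|"-cmozhtf"| is
\verb|#4| and \verb|"-p"| is \verb|#5|.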

The rest of this document introduces the system in a little more detail.
See~\cite{authdoc} and~\cite{website} for authoritative information.
Section~\ref{style} examines the
options for modifying the way in which \TeX\ processes the source;
specifically, these can be thought of as options for the macros in
\verb|tex4ht.sty|. Section~\ref{postproc} deals
with the post-processing that converts \TeX's output into hypertext.
Section~\ref{supple} shows how one can change the
way the system generates the supplementary files like images and
style-sheets for the hypertext output.

This document assumes that the reader has some familiarity with
the \TeX\ and \LaTeX\ systems; see \cite{tex} and \cite{latex} for
more information.

\section{Options for Styles}\label{style}
Options for \TeX\ and \LaTeX\ processing can be added as the first
optional argument (\verb|#3| above) to the \verb|mk4ht| command. For
example, the command
\begin{verbatim}
	mk4ht xhmlatex tex4ht_doc.tex
\end{verbatim}
is in fact similar\footnote{The differences lie in the font files
chosen, as described in Section~\ref{postproc}.} to the command
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "xhtml,mathml"
\end{verbatim}
Similarly,
\begin{verbatim}
	mk4ht oolatex tex4ht_doc.tex
\end{verbatim}
is in fact similar to the command
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "xhtml,ooffice"
\end{verbatim}
In most cases this list of options begins with \verb|html| or
\verb|xhtml|. Additional options available can be found by searching 
for the string \verb|--- Note ---| at the start of a line in the
resulting log file.  For example
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex 
	grep -A 1 '^--- Note ---' tex4ht_doc.log
\end{verbatim}
will list all the available options for \verb|html| conversion.
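
As an illustration, and assuming that the numeric option \verb|2|
appears in the list reported for your installation, the following
should split the output into separate pages at the section level
instead of producing one long page.
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "html,2"
\end{verbatim}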

When this list of options does not start with \verb|html| or
\verb|xhtml| then the system looks for a file with the name given by
the first option and the \verb|.cfg| extension. The simplest use of
this feature is as follows. Create a file called \verb|bgimage.cfg|
containing the lines
\begin{verbatim}
	\Preamble{html}
	\begin{document}
	\Css{BODY { background-image : url(background.png); }}
	\EndPreamble
\end{verbatim}
After this 
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "bgimage"
\end{verbatim}
will add an additional line to \verb|tex4ht_doc.css| incorporating
the image \verb|background.png|. See the main documentation
\cite{authdoc} for more details on creating configuration files.
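
A slightly larger sketch along the same lines is a hypothetical file
\verb|mystyle.cfg| that issues several \verb|\Css| lines; the file
name and the style rules are only illustrative.
\begin{verbatim}
	\Preamble{html}
	\begin{document}
	\Css{BODY { max-width : 40em; margin : auto; }}
	\Css{A { text-decoration : none; }}
	\EndPreamble
\end{verbatim}
It would be used in the same way:
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "mystyle"
\end{verbatim}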

\section{Post processing}\label{postproc}
The optional arguments \verb|#4| and \verb|#5| refer to options for
the \verb|tex4ht| and \verb|t4ht| commands respectively. Both these
commands make use of the configuration file \verb|tex4ht.env| (which
may be overridden by \verb|.tex4ht| in the current directory or the
user's home directory). This configuration file is called the
``environment file'' in the main documentation~\cite{authdoc} in
order to avoid confusing it with the configuration file described in
the previous section.

The program \verb|tex4ht| has to look for ``font descriptions'' that
describe how various non-standard glyphs are to be ``rendered'' in
hypertext. The TeX4ht system provides a number of possibilities like
using Unicode or fonts suited to the Gecko engine of the
Mozilla browser and so on. So the command
\begin{verbatim}
	mk4ht mzlatex tex4ht_doc.tex
\end{verbatim}
is almost\footnote{There is an additional option, as explained in
Section~\ref{supple} below.} equivalent to
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "xhtml,mozilla" "-cmozhtf"
\end{verbatim}
The \verb|-c<tagname>| option for \verb|tex4ht| picks up the tagged
section from the \verb|tex4ht.env| environment file. Any other
command-line option of \verb|tex4ht| can also be used as part of
\verb|#4|, which is just a space-separated list of options for this
command.
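
As a further illustration, a combination that is often used to obtain
UTF-8 encoded output is sketched below. It assumes that your
\verb|tex4ht.env| has a section tagged \verb|unihtf| and that your
version of \verb|tex4ht| accepts the \verb|-utf8| switch; neither is
described in this document, so please check the main
documentation~\cite{authdoc}.
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "xhtml,charset=utf-8" "-cunihtf -utf8"
\end{verbatim}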

\section{Creating Supplementary Files}\label{supple}
The final step of conversion is the creation of supplementary files
such as image files for formulae and equations like
\[ \frac{x^n-1}{x-1} = \sum_{i=0}^{n-1} x^i \]
which is the rendering of the \LaTeX\ input string
\begin{verbatim}
	\[ \frac{x^n-1}{x-1} = \sum_{i=0}^{n-1} x^i \]
\end{verbatim}
In most cases such \TeX\ constructions can only be rendered as
images. The \verb|tex4ht| program creates a series of instructions
for the \verb|t4ht| program in a \verb|.lg| file. The latter carries
out these instructions by making use of external programs like
\verb|dvipng| or \verb|convert| to create these images. The most
useful option in the argument list \verb|#5| is \verb|-p| which
prevents images from being generated. Another useful option is
\verb|-cvalidate| which causes the net output to be validated using
an external validation program such as \verb|xmllint|. All the
options in the argument list \verb|#5| are passed on to \verb|t4ht|.
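
For example, while drafting a document with many formulae one might
skip image generation altogether. The call below is only a sketch; it
leaves \verb|#3| at \verb|html| and assumes that an empty \verb|#4|
is acceptable as a placeholder.
\begin{verbatim}
	mk4ht htlatex tex4ht_doc.tex "html" "" "-p"
\end{verbatim}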

\section{Some differences between TeX4ht and TeX}
We document some differences between the systems. For more up-to-date
information please see the author's documentation~\cite{authdoc}.

\subsection{Regarding filenames}
In short, do {\em not} use special characters in your filenames;
ideally stick with filenames which are composed of standard ASCII
alphanumerics wherever possible. Some explanations follow.

\TeX\ nowadays accepts files with names that contain all manner of
characters and so it is natural to imagine that TeX4ht will do so too.
However, one has to be concerned with the filenames used in output as
well as those used for input. Since these names will appear in
URLs within the hypertext, using special characters
will cause hyperlinks to break. Thus TeX4ht does not currently behave
well if special characters are used in input file names.
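
As a small illustration with hypothetical file names, a source called
\verb|my notes (v2).tex| is better copied to a plain name before
conversion:
\begin{verbatim}
	cp "my notes (v2).tex" my_notes_v2.tex
	mk4ht htlatex my_notes_v2.tex
\end{verbatim}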

\subsection{Extra braces required}
In short, when in doubt, enclose subscripts and superscripts in braces if
they are longer than a single character.

In this respect the syntax of the TeX language that is accepted by
TeX4ht is stricter than that accepted by \TeX\ and \LaTeX. 
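
The following lines sketch the difference. The first form is accepted
by \LaTeX\ but is of the kind that may fail under TeX4ht; the braced
forms are safe in both systems.
\begin{verbatim}
	$x^\frac{1}{2}$       % accepted by LaTeX, may fail under TeX4ht
	$x^{\frac{1}{2}}$     % explicit braces around the superscript
	$a_{ij} + e^{-t}$     % scripts longer than one character
\end{verbatim}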

\begin{thebibliography}{00}
\bibitem[1]{authdoc}
\url{http://www.cse.ohio-state.edu/~gurari/mn.html}
The authoritative documentation maintained by Eitan M. Gurari.
\bibitem[2]{website}
\url{http://www.cse.ohio-state.edu/~gurari}
Eitan M.~Gurari's web page that discusses related projects.
\bibitem[3]{tex}
\url{http://www.tug.org/}
The \TeX\ Users Group primary web site.
\bibitem[4]{latex}
\url{http://www.latex-project.org/}
The \LaTeX\ project's primary web site.
\end{thebibliography}
\end{document}