<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <link rel="stylesheet" href="style.css" type="text/css"> <link rel="Start" href="index.html"> <link rel="previous" href="Glade.html"> <link rel="next" href="GlGtk.html"> <link rel="Up" href="index.html"> <title>LablGTK : Xml_lexer</title> </head> <body> <div class="navbar"><a href="Glade.html">Previous</a> <a href="index.html">Up</a> <a href="GlGtk.html">Next</a> </div> <center><h1>Module <a href="type_Xml_lexer.html">Xml_lexer</a></h1></center> <br> <pre><span class="keyword">module</span> Xml_lexer: <code class="code">sig</code> <a href="Xml_lexer.html">..</a> <code class="code">end</code></pre>Simple XML lexer<br> <hr width="100%"> <br> This module provides an <code class="code">ocamllex</code> lexer for XML files. It supports only the most basic features of the XML specification. <p> The lexer ignores the following 'events' altogether: comments, processing instructions, the XML prolog and the doctype declaration. <p> The predefined entities (<code class="code">&amp;amp;</code>, <code class="code">&amp;lt;</code>, etc.) are supported. The replacement text for other entities whose entity value consists of character data can be provided to the lexer (see <a href="Xml_lexer.html#VALentities"><code class="code">Xml_lexer.entities</code></a>). Internal entity declarations are <em>not</em> taken into account (the lexer just skips the doctype declaration). <p> <code class="code">CDATA</code> sections and character references are supported. 
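<p> The features described above can be tried out with a short program. The following is a minimal sketch, not code taken from the library: it assumes that the <code class="code">Xml_lexer</code> module is available at link time, and the helper <code class="code">tokens_of_string</code>, the sample input and the <code class="code">copy</code> entity are made up for illustration. It registers one extra entity, reads tokens until <code class="code">EOF</code>, and reports any <code class="code">Error</code> together with its position.
<pre><code class="code">(* Hypothetical helper: collect every token of [s] up to (not including) EOF. *)
let tokens_of_string s =
  let lexbuf = Lexing.from_string s in
  let rec loop acc =
    match Xml_lexer.token lexbuf with
    | Xml_lexer.EOF -&gt; List.rev acc
    | tok -&gt; loop (tok :: acc)
  in
  loop []

(* Print one token in a readable form. *)
let print_token = function
  | Xml_lexer.Tag (name, attrs, empty) -&gt;
      Printf.printf "Tag %s (%d attributes, empty=%b)\n"
        name (List.length attrs) empty
  | Xml_lexer.Chars text -&gt; Printf.printf "Chars %S\n" text
  | Xml_lexer.Endtag name -&gt; Printf.printf "Endtag %s\n" name
  | Xml_lexer.EOF -&gt; ()

let () =
  (* Illustrative extra entity; it is not one of the predefined ones. *)
  Xml_lexer.entities := ("copy", "(c)") :: !Xml_lexer.entities;
  try
    List.iter print_token
      (tokens_of_string "&lt;a href=\"x\"&gt;1 &amp;lt; 2 &amp;copy;&lt;/a&gt;")
  with Xml_lexer.Error (err, pos) -&gt;
    Printf.eprintf "XML error at character %d: %s\n"
      pos (Xml_lexer.error_string err)</code></pre>
Prepending to the association list keeps the predefined entities in place, so both <code class="code">&amp;lt;</code> and the added entity can be resolved.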
<p> See <a href="Xml_lexer.html#VALstrip_ws"><code class="code">Xml_lexer.strip_ws</code></a> about whitespace handling.<br> <br> <a name="3_Errorreporting"></a> <h3>Error reporting</h3><br> <br><code><span class="keyword">type</span> <a name="TYPEerror"></a><code class="type"></code>error = </code><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Illegal_character</span> <span class="keyword">of</span> <code class="type">char</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Bad_entity</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Unterminated</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Tag_expected</span></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Attribute_expected</span></code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Other</span> <span class="keyword">of</span> <code class="type">string</code></code></td> </tr></table> <pre><span class="keyword">val</span> <a name="VALerror_string"></a>error_string : <code class="type"><a href="Xml_lexer.html#TYPEerror">error</a> -> string</code></pre><pre><span class="keyword">exception</span> <a name="EXCEPTIONError"></a>Error <span 
class="keyword">of</span> <code class="type"><a href="Xml_lexer.html#TYPEerror">error</a> * int</code></pre> <div class="info"> This exception is raised when an error occurs during parsing. The <code class="code">int</code> argument is the character position in the buffer. Note that some non-conforming XML documents might not trigger an error.<br> </div> <br> <a name="3_API"></a> <h3>API</h3><br> <br><code><span class="keyword">type</span> <a name="TYPEtoken"></a><code class="type"></code>token = </code><table class="typetable"> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Tag</span> <span class="keyword">of</span> <code class="type">string * (string * string) list * bool</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" ><code class="code">Tag (name, attributes, empty)</code> denotes an opening tag with the specified <code class="code">name</code> and <code class="code">attributes</code>. 
If <code class="code">empty</code> is <code class="code">true</code>, the tag ended in "/>", so the element has no content.</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Chars</span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >Some text between the tags</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">Endtag</span> <span class="keyword">of</span> <code class="type">string</code></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >A closing tag</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr> <tr> <td align="left" valign="top" > <code><span class="keyword">|</span></code></td> <td align="left" valign="top" > <code><span class="constructor">EOF</span></code></td> <td class="typefieldcomment" align="left" valign="top" ><code>(*</code></td><td class="typefieldcomment" align="left" valign="top" >End of input</td><td class="typefieldcomment" align="left" valign="bottom" ><code>*)</code></td> </tr></table> <div class="info"> The type of the XML document elements<br> </div> <pre><span class="keyword">val</span> <a name="VALstrip_ws"></a>strip_ws : <code class="type">bool Pervasives.ref</code></pre><div class="info"> Whitespace handling: if <code class="code">strip_ws</code> is <code class="code">true</code> (the default), whitespace next to a tag is ignored. 
Character data consisting only of whitespace is thus suppressed (i.e. <code class="code">Chars ""</code> tokens are skipped).<br> </div> <pre><span class="keyword">val</span> <a name="VALentities"></a>entities : <code class="type">(string * string) list Pervasives.ref</code></pre><div class="info"> An association list of entity definitions. Initially, it contains the predefined entities (<code class="code"> ["amp", "&amp;"; "lt", "&lt;" ...] </code>).<br> </div> <pre><span class="keyword">val</span> <a name="VALtoken"></a>token : <code class="type">Lexing.lexbuf -> <a href="Xml_lexer.html#TYPEtoken">token</a></code></pre><div class="info"> The entry point of the lexer.<br> <b>Raises</b> <code>Error</code> if the XML document is invalid<br> <b>Returns</b> the next token in the buffer<br> </div> </body></html>