<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<link rel="stylesheet" href="style.css" type="text/css">
<meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
<link rel="Start" href="index.html">
<link rel="previous" href="Intro_extensions.html">
<link rel="next" href="Intro_events.html">
<link rel="Up" href="index.html">
<link title="Index of types" rel=Appendix href="index_types.html">
<link title="Index of exceptions" rel=Appendix href="index_exceptions.html">
<link title="Index of values" rel=Appendix href="index_values.html">
<link title="Index of class methods" rel=Appendix href="index_methods.html">
<link title="Index of classes" rel=Appendix href="index_classes.html">
<link title="Index of class types" rel=Appendix href="index_class_types.html">
<link title="Index of modules" rel=Appendix href="index_modules.html">
<link title="Index of module types" rel=Appendix href="index_module_types.html">
<link title="Pxp_types" rel="Chapter" href="Pxp_types.html">
<link title="Pxp_document" rel="Chapter" href="Pxp_document.html">
<link title="Pxp_dtd" rel="Chapter" href="Pxp_dtd.html">
<link title="Pxp_tree_parser" rel="Chapter" href="Pxp_tree_parser.html">
<link title="Pxp_core_types" rel="Chapter" href="Pxp_core_types.html">
<link title="Pxp_ev_parser" rel="Chapter" href="Pxp_ev_parser.html">
<link title="Pxp_event" rel="Chapter" href="Pxp_event.html">
<link title="Pxp_dtd_parser" rel="Chapter" href="Pxp_dtd_parser.html">
<link title="Pxp_codewriter" rel="Chapter" href="Pxp_codewriter.html">
<link title="Pxp_marshal" rel="Chapter" href="Pxp_marshal.html">
<link title="Pxp_yacc" rel="Chapter" href="Pxp_yacc.html">
<link title="Pxp_reader" rel="Chapter" href="Pxp_reader.html">
<link title="Intro_trees" rel="Chapter" href="Intro_trees.html">
<link title="Intro_extensions" rel="Chapter" href="Intro_extensions.html">
<link title="Intro_namespaces" rel="Chapter" href="Intro_namespaces.html">
<link title="Intro_events" rel="Chapter" href="Intro_events.html">
<link title="Intro_resolution" rel="Chapter" href="Intro_resolution.html">
<link title="Intro_getting_started" rel="Chapter" href="Intro_getting_started.html">
<link title="Intro_advanced" rel="Chapter" href="Intro_advanced.html">
<link title="Intro_preprocessor" rel="Chapter" href="Intro_preprocessor.html">
<link title="Example_readme" rel="Chapter" href="Example_readme.html"><link title="Namespaces" rel="Section" href="#1_Namespaces">
<link title="Namespace URI's and prefixes" rel="Subsection" href="#2_NamespaceURIsandprefixes">
<link title="Example for prefix normalization" rel="Subsection" href="#2_Exampleforprefixnormalization">
<link title="Getting more details of namespaces" rel="Subsection" href="#2_Gettingmoredetailsofnamespaces">
<title>PXP Reference : Intro_namespaces</title>
</head>
<body>
<div class="navbar"><a href="Intro_extensions.html">Previous</a>
&nbsp;<a href="index.html">Up</a>
&nbsp;<a href="Intro_events.html">Next</a>
</div>
<center><h1>Intro_namespaces</h1></center>
<br>
<br>
This text explains how PXP deals with the optional namespace
declarations in XML text.
<p>

<a name="1_Namespaces"></a>
<h1>Namespaces</h1>
<p>

PXP supports namespaces, but they have to be enabled explicitly.
To simplify the handling of namespace-aware documents, PXP applies a
transformation to the document called "prefix normalization". This
transformation ensures that every namespace prefix uniquely identifies
a namespace throughout the whole document.
<p>

<a name="3_Linkstootherdocumentation"></a>
<h3>Links to other documentation</h3>
<p>
<ul>
<li><a href="Intro_getting_started.html#namespaces"><i>Namespaces</i></a></li>
<li><a href="Pxp_dtd.namespace_manager.html"><code class="code"><span class="constructor">Pxp_dtd</span>.namespace_manager</code></a></li>
<li><a href="Pxp_dtd.html#VALcreate_namespace_manager"><code class="code"><span class="constructor">Pxp_dtd</span>.create_namespace_manager</code></a></li>
<li><a href="Pxp_dtd.namespace_scope.html"><code class="code"><span class="constructor">Pxp_dtd</span>.namespace_scope</code></a></li>
<li><a href="Pxp_dtd.html#VALcreate_namespace_scope"><code class="code"><span class="constructor">Pxp_dtd</span>.create_namespace_scope</code></a></li>
<li>Trees and namespaces: <a href="Intro_trees.html#access"><i>Access methods</i></a>, see the namespace subsection</li>
<li><a href="Intro_advanced.html#irrnodes"><i>Irregular nodes: namespace nodes and attribute nodes</i></a></li>
<li><a href="Intro_events.html#namespaces"><i>Events and namespaces</i></a></li>
</ul>

<a name="2_NamespaceURIsandprefixes"></a>
<h2>Namespace URI's and prefixes</h2>
<p>

A namespace is identified by a namespace URI (e.g. something like
"http://company.org/namespaces/project1"; note that this URI is simply
processed as a string and is never retrieved over HTTP). Because such
URIs are long, one defines a so-called namespace prefix as a shorthand
for the URI. For example:
<p>

<pre></pre><code class="code">&nbsp;&lt;x:q&nbsp;xmlns:x=<span class="string">"http://company.org/namespaces/project1"</span>&gt;...&lt;/x:q&gt;&nbsp;</code><pre></pre>
<p>

The "xmlns:x" attribute is special: it declares that within this
subtree the prefix "x" stands for the long URI. Here, "x:q" denotes
the element "q" in the namespace that is bound to the prefix "x".
<p>

The problem is that the namespace is defined by the URI, and not by
the prefix. In another subtree you may want to use the prefix "y" for the
same namespace. This has always made it difficult to deal with namespaces
in XML-processing software.
<p>
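
To make the difficulty concrete, the following OCaml sketch compares two
prefixed names by their expanded form (namespace URI plus local name).
It does not use PXP: the type <code class="code">expanded_name</code> and the function
<code class="code">expand</code> are made up for this illustration, and the association list
of declarations stands in for the "xmlns" attributes in effect.
<p>

```ocaml
(* An expanded name: namespace URI plus local name.
   Hypothetical type for illustration, not part of PXP. *)
type expanded_name = { uri : string; local : string }

(* Resolve a prefixed name against the declarations in effect.
   [decls] maps prefixes to namespace URIs.
   Raises [Not_found] if the name has no colon or the prefix is
   not declared. *)
let expand decls qname =
  let i = String.index qname ':' in
  let prefix = String.sub qname 0 i in
  let local = String.sub qname (i + 1) (String.length qname - i - 1) in
  { uri = List.assoc prefix decls; local }

let () =
  let uri = "http://company.org/namespaces/project1" in
  let a = expand [ "x", uri ] "x:q" in
  let b = expand [ "y", uri ] "y:q" in
  (* Different prefixes, but the same URI and the same local name. *)
  assert (a = b);
  print_endline "x:q and y:q denote the same element"
```
<p>

A program that compared the strings "x:q" and "y:q" directly would treat
them as different elements, which is the mistake that prefix
normalization is meant to rule out.
<p>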

PXP, however, performs prefix normalization before it returns the
tree. This means that every prefix is replaced by the norm prefix
(or "normprefix") of its namespace. The norm prefix can be the first
prefix used for the namespace, a prefix declared with a PXP extension,
or a binding of the norm prefix to the namespace that is declared
programmatically.
<p>

In order to use the PXP implementation of namespaces, one has to
set <code class="code">enable_namespace_processing</code> in the parser configuration, and
to use namespace-aware node implementations. If you don't use extended
node trees, this means using <a href="Pxp_tree_parser.html#VALdefault_namespace_spec"><code class="code"><span class="constructor">Pxp_tree_parser</span>.default_namespace_spec</code></a>
instead of <a href="Pxp_tree_parser.html#VALdefault_spec"><code class="code"><span class="constructor">Pxp_tree_parser</span>.default_spec</code></a>. The following
is a good starting point that enables all of this:
<p>

<pre></pre><code class="code">&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;nsmng&nbsp;=&nbsp;<span class="constructor">Pxp_dtd</span>.create_namespace_manager()<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;config&nbsp;=&nbsp;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;<span class="constructor">Pxp_types</span>.default_config&nbsp;<span class="keyword">with</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;enable_namespace_processing&nbsp;=&nbsp;<span class="constructor">Some</span>&nbsp;nsmng<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;source&nbsp;=&nbsp;...<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;spec&nbsp;=&nbsp;<span class="constructor">Pxp_tree_parser</span>.default_namespace_spec<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;doc&nbsp;=&nbsp;<span class="constructor">Pxp_tree_parser</span>.parse_document_entity&nbsp;config&nbsp;source&nbsp;spec<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;root&nbsp;=&nbsp;doc<span class="keywordsign">#</span>root<br>
</code><pre></pre>
<p>

The namespace-aware implementations of the <code class="code">node</code> class type define
additional namespace methods like <code class="code">namespace_uri</code> (see
<a href="Pxp_document.node.html#METHODnamespace_uri"><code class="code"><span class="constructor">Pxp_document</span>.node.namespace_uri</code></a>). (Although you could also direct
the parser to create non-namespace-aware nodes, this makes little
sense, because you then do not get these special access methods.)
<p>

The method <code class="code">namespace_scope</code> (see
<a href="Pxp_document.node.html#METHODnamespace_scope"><code class="code"><span class="constructor">Pxp_document</span>.node.namespace_scope</code></a>) provides more
information about what happened during prefix normalization. In particular,
it is possible to find out the original prefix in the XML text (which
is also called <b>display prefix</b>), before it was mapped to the
normalized prefix.  The <code class="code">namespace_scope</code> method returns a
<a href="Pxp_dtd.namespace_scope.html"><code class="code"><span class="constructor">Pxp_dtd</span>.namespace_scope</code></a> object with additional lookup methods.
<p>

<a name="2_Exampleforprefixnormalization"></a>
<h2>Example for prefix normalization</h2>
<p>

In the following XML snippet the prefix "h" is declared as a shorthand
for the XHTML namespace:
<p>

<pre></pre><code class="code">&lt;h:html&nbsp;xmlns:h=<span class="string">"http://www.w3.org/1999/xhtml"</span>&gt;&nbsp;<br>
&nbsp;&nbsp;&lt;h:head&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;h:title&gt;<span class="constructor">Virtual</span>&nbsp;<span class="constructor">Library</span>&lt;/h:title&gt;&nbsp;<br>
&nbsp;&nbsp;&lt;/h:head&gt;&nbsp;<br>
&nbsp;&nbsp;&lt;h:body&gt;&nbsp;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;h:p&gt;<span class="constructor">Moved</span>&nbsp;<span class="keyword">to</span>&nbsp;&lt;h:a&nbsp;href=<span class="string">"http://vlib.org/"</span>&gt;vlib.org&lt;/h:a&gt;.&lt;/h:p&gt;&nbsp;<br>
&nbsp;&nbsp;&lt;/h:body&gt;&nbsp;<br>
&lt;/h:html&gt;<br>
</code><pre></pre>
<p>

In this example, normalization changes nothing, because the prefix
"h" has the same meaning throughout the whole document. However, keep
in mind that every author of XHTML documents can freely choose the
prefix to use.
<p>

The XML standard even gives the author of the document the freedom to
change the meaning of a prefix at any point. For example, here the
prefix "x" is rebound in the inner node:
<p>

<pre></pre><code class="code">&lt;x:address&nbsp;xmlns:x=<span class="string">"http://addresses.org"</span>&gt;<br>
&nbsp;&nbsp;&lt;x:name&nbsp;xmlns:x=<span class="string">"http://names.org"</span>&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="constructor">Gerd</span>&nbsp;<span class="constructor">Stolpmann</span><br>
&nbsp;&nbsp;&lt;/x:name&gt;<br>
&lt;/x:address&gt;<br>
</code><pre></pre>
<p>

In the outer node the prefix "x" is connected with the
"http://addresses.org" namespace, but in the inner node it is
connected with "http://names.org".
<p>

After normalization, the prefixes would look as follows:
<p>

<pre></pre><code class="code">&lt;x:address&nbsp;xmlns:x=<span class="string">"http://addresses.org"</span>&gt;<br>
&nbsp;&nbsp;&lt;x1:name&nbsp;xmlns:x1=<span class="string">"http://names.org"</span>&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="constructor">Gerd</span>&nbsp;<span class="constructor">Stolpmann</span><br>
&nbsp;&nbsp;&lt;/x1:name&gt;<br>
&lt;/x:address&gt;<br>
</code><pre></pre>
<p>

In order to avoid overridden prefixes, the prefix in the inner node
was changed to "x1" (for type theorists: think of alpha conversion).
<p>
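
One way to picture the renaming is the following toy normalizer in plain
OCaml. It is only a sketch and not PXP's implementation: the record
<code class="code">manager</code> and the functions <code class="code">create_manager</code>, <code class="code">bind</code> and
<code class="code">normalize</code> are hypothetical, and the rule of appending a counter to
an occupied prefix is an assumption chosen to reproduce the "x1" above.
<p>

```ocaml
(* Toy namespace manager: keeps a one-to-one mapping between
   namespace URIs and normprefixes. Hypothetical, not PXP's class. *)
type manager = {
  by_uri : (string, string) Hashtbl.t;     (* URI -> normprefix *)
  by_prefix : (string, string) Hashtbl.t;  (* normprefix -> URI *)
}

let create_manager () =
  { by_uri = Hashtbl.create 8; by_prefix = Hashtbl.create 8 }

(* Record that [prefix] is the normprefix of [uri]. *)
let bind mng prefix uri =
  Hashtbl.replace mng.by_uri uri prefix;
  Hashtbl.replace mng.by_prefix prefix uri

(* Return the normprefix for [uri]. A URI seen for the first time
   gets the display prefix if that is still free; otherwise a counter
   is appended until an unused prefix is found (assumed rule). *)
let normalize mng display_prefix uri =
  match Hashtbl.find_opt mng.by_uri uri with
  | Some p -> p
  | None ->
      let rec pick n =
        let candidate =
          if n = 0 then display_prefix
          else display_prefix ^ string_of_int n
        in
        if Hashtbl.mem mng.by_prefix candidate then pick (n + 1)
        else candidate
      in
      let p = pick 0 in
      bind mng p uri;
      p

let () =
  let mng = create_manager () in
  let outer = normalize mng "x" "http://addresses.org" in
  let inner = normalize mng "x" "http://names.org" in
  Printf.printf "%s %s\n" outer inner
```
<p>

With the two declarations from the example, the sketch keeps "x" for the
first namespace and yields "x1" for the second. A later declaration of a
different prefix for "http://addresses.org" would still map to "x",
because the lookup is keyed by URI.
<p>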

The idea of prefix normalization is to simplify how programs can match
against element and attribute names. It is possible to configure the
normalizer so that certain prefixes are used for certain URIs.
In this example, we could direct the normalizer to use the prefixes
"addr" and "nm" instead of the quite arbitrary strings "x" and "x1":
<p>

<pre></pre><code class="code">dtd&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;add_namespace&nbsp;<span class="string">"addr"</span>&nbsp;<span class="string">"http://addresses.org"</span>;<br>
dtd&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;add_namespace&nbsp;<span class="string">"nm"</span>&nbsp;<span class="string">"http://names.org"</span>;<br>
</code><pre></pre>
<p>

For this to work you need access to the <code class="code">dtd</code> object before the parser
actually starts its work. The parsing functions in <a href="Pxp_tree_parser.html"><code class="code"><span class="constructor">Pxp_tree_parser</span></code></a>
have the special hook <code class="code">transform_dtd</code> that is called at the right
moment, and allows the program to enter such special configurations
into the DTD object. The resulting program could then look like:
<p>

<pre></pre><code class="code">&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;nsmng&nbsp;=&nbsp;<span class="constructor">Pxp_dtd</span>.create_namespace_manager()<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;config&nbsp;=&nbsp;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;<span class="constructor">Pxp_types</span>.default_config&nbsp;<span class="keyword">with</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;enable_namespace_processing&nbsp;=&nbsp;<span class="constructor">Some</span>&nbsp;nsmng<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;source&nbsp;=&nbsp;...<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;spec&nbsp;=&nbsp;<span class="constructor">Pxp_tree_parser</span>.default_namespace_spec<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;transform_dtd&nbsp;dtd&nbsp;=<br>
&nbsp;&nbsp;&nbsp;&nbsp;dtd&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;add_namespace&nbsp;<span class="string">"addr"</span>&nbsp;<span class="string">"http://addresses.org"</span>;<br>
&nbsp;&nbsp;&nbsp;&nbsp;dtd&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;add_namespace&nbsp;<span class="string">"nm"</span>&nbsp;<span class="string">"http://names.org"</span>;<br>
&nbsp;&nbsp;&nbsp;&nbsp;dtd<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;doc&nbsp;=&nbsp;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="constructor">Pxp_tree_parser</span>.parse_document_entity&nbsp;~transform_dtd&nbsp;config&nbsp;source&nbsp;spec<br>
&nbsp;&nbsp;<span class="keyword">let</span>&nbsp;root&nbsp;=&nbsp;doc<span class="keywordsign">#</span>root<br>
</code><pre></pre>
<p>

Alternatively, it is also possible to put special processing instructions
into the DTD:
<p>

<pre></pre><code class="code">&lt;?pxp:dtd&nbsp;namespace&nbsp;prefix=<span class="string">"addr"</span>&nbsp;uri=<span class="string">"http://addresses.org"</span><span class="keywordsign">?&gt;</span><br>
&lt;?pxp:dtd&nbsp;namespace&nbsp;prefix=<span class="string">"nm"</span>&nbsp;uri=<span class="string">"http://names.org"</span><span class="keywordsign">?&gt;</span><br>
</code><pre></pre>
<p>

The advantage of configuring specific normprefixes is that one can now
use them directly in programs, e.g. for matching:
<p>

<pre></pre><code class="code">&nbsp;&nbsp;<span class="keyword">match</span>&nbsp;node<span class="keywordsign">#</span>node_type&nbsp;<span class="keyword">with</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="keywordsign">|</span>&nbsp;<span class="constructor">T_element</span>&nbsp;<span class="string">"addr:address"</span>&nbsp;<span class="keywordsign">-&gt;</span>&nbsp;...<br>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="keywordsign">|</span>&nbsp;<span class="constructor">T_element</span>&nbsp;<span class="string">"nm:name"</span>&nbsp;<span class="keywordsign">-&gt;</span>&nbsp;...<br>
</code><pre></pre>
<p>
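
The effect of configured normprefixes on matching can be imitated
without PXP. In the sketch below, the table <code class="code">normprefixes</code> and the
functions <code class="code">normalize_name</code> and <code class="code">describe</code> are hypothetical names
introduced for this illustration; the per-element association list
stands in for the "xmlns" declarations in effect at that element.
<p>

```ocaml
(* Assumed configuration: namespace URI -> normprefix. *)
let normprefixes =
  [ "http://addresses.org", "addr";
    "http://names.org", "nm" ]

(* Rewrite a name as written in the document into its normalized
   form. [scope] maps display prefixes to namespace URIs at the
   position where the name occurs. Raises [Not_found] for an
   undeclared prefix or an unconfigured URI. *)
let normalize_name scope qname =
  match String.index_opt qname ':' with
  | None -> qname (* unprefixed names are left alone in this sketch *)
  | Some i ->
      let display = String.sub qname 0 i in
      let local = String.sub qname (i + 1) (String.length qname - i - 1) in
      let uri = List.assoc display scope in
      List.assoc uri normprefixes ^ ":" ^ local

(* Matching can now rely on the fixed normprefixes. *)
let describe name =
  match name with
  | "addr:address" -> "an address"
  | "nm:name" -> "a name"
  | other -> "unknown element " ^ other

let () =
  let outer = [ "x", "http://addresses.org" ] in
  let inner = [ "x", "http://names.org" ] in
  print_endline (describe (normalize_name outer "x:address"));
  print_endline (describe (normalize_name inner "x:name"))
```
<p>

Both elements carry the display prefix "x" in the document, yet the
matching code only ever sees "addr" and "nm".
<p>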

<a name="2_Gettingmoredetailsofnamespaces"></a>
<h2>Getting more details of namespaces</h2>
<p>

There are two additional objects that are relevant. First, there is a
namespace manager for the whole tree. This object collects all namespace
URIs that occur in the XML text, and decides which normprefixes
are associated with them: <a href="Pxp_dtd.namespace_manager.html"><code class="code"><span class="constructor">Pxp_dtd</span>.namespace_manager</code></a>.
<p>

Second, there is the namespace scope. An XML tree may have a lot of such
objects. A new scope object is created whenever new namespaces are
introduced, i.e. when there are "xmlns" declarations. The scope object
has a pointer to the scope object for the surrounding XML text. Scope
objects are documented here: <a href="Pxp_dtd.namespace_scope.html"><code class="code"><span class="constructor">Pxp_dtd</span>.namespace_scope</code></a>.
<p>

Some examples (when <code class="code">n</code> is a node):
<p>

<ul>
<li>To find out which normprefix is used for a namespace URI, use
     <pre></pre><code class="code">&nbsp;n&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;get_normprefix&nbsp;uri&nbsp;</code><pre></pre> </li>
<li>To find out the reverse, i.e. which URI is represented by a certain
     normprefix, use
     <pre></pre><code class="code">&nbsp;n&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_manager&nbsp;<span class="keywordsign">#</span>&nbsp;get_primary_uri&nbsp;prefix&nbsp;</code><pre></pre> </li>
<li>To find out which namespace URI is meant by a display prefix, i.e.
     the prefix as it occurs literally in the XML text:
     <pre></pre><code class="code">&nbsp;n&nbsp;<span class="keywordsign">#</span>&nbsp;namespace_scope&nbsp;<span class="keywordsign">#</span>&nbsp;uri_of_display_prefix&nbsp;prefix&nbsp;</code><pre></pre> </li>
</ul>
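<p>

The chain of scope objects described above can be pictured with a small
model in plain OCaml. This is a sketch under assumptions, not the
<code class="code">namespace_scope</code> class itself: the record type <code class="code">scope</code> is
hypothetical, the function merely borrows the name
<code class="code">uri_of_display_prefix</code>, and "http://example.org/other" is a made-up URI.
<p>

```ocaml
(* A toy scope: the declarations made at one element plus a link
   to the scope of the surrounding XML text. *)
type scope = {
  declarations : (string * string) list;  (* display prefix, URI *)
  parent : scope option;
}

(* Look up a display prefix, innermost declaration first.
   Raises [Not_found] if no enclosing scope declares the prefix. *)
let rec uri_of_display_prefix scope prefix =
  match List.assoc_opt prefix scope.declarations with
  | Some uri -> uri
  | None ->
      (match scope.parent with
       | Some p -> uri_of_display_prefix p prefix
       | None -> raise Not_found)

(* Scopes for the address example, plus one nested declaration. *)
let outer =
  { declarations = [ "x", "http://addresses.org" ]; parent = None }
let inner =
  { declarations = [ "x", "http://names.org" ]; parent = Some outer }
let innermost =
  { declarations = [ "y", "http://example.org/other" ]; parent = Some inner }

let () =
  print_endline (uri_of_display_prefix outer "x");
  print_endline (uri_of_display_prefix innermost "x")
```
<p>

The second lookup starts in a scope that does not declare "x" itself, so
it follows the parent pointer and finds the redeclaration in the inner
scope before it ever reaches the outer one.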

<br>
</body></html>