<html>
<head>
<title>Chapter 9: Using XML</title>
</head>
<body bgcolor="#FFFFFF">
<h1>Chapter 9: Using XML</h1>
<p>  XML is becoming an important data interchange format. More and more
  services are being made available over the web using XML. XML is a
  format to describe structured documents and data. Since XML is a
  widely supported standard it provides a good medium for exchanging
  structured information between systems.</p><p>  Zope supports XML on many fronts. You can generate XML from Zope
  objects thus allowing foreign systems to understand you, and you can
  import XML into Zope in order to decipher and manage it. Zope also
  supports exporting Zope objects in an XML format, and it supports
  several XML-based Internet protocols including WebDAV and XML-RPC. </p><h2>  Managing XML with XML Document</h2>
<p>    You can use XML in Zope with XML Documents. XML Documents hold XML
    content that you can upload, download, and edit with the Zope
    management interface. You can also script XML Documents using the
    Internet standard Document Object Model (DOM) API. Zope is much
    more than an XML repository, since once your XML data is in Zope
    it can take advantage of all Zope's services such as persistence,
    security, cataloging, presentation, and more.</p><h3>    Using XML Document</h3>
<p>      To create an XML Document, choose <em>XML Document</em> from the
      product add list. Then click the <em>Add</em> button. You will be taken
      to an XML Document add form as shown in Figure 9-1.</p><p>      <img src="Figures/uc.png" alt="XML Document add form">
</p><p>      The <em>Id</em> and <em>Title</em> fields allow you to specify a standard Zope
      id and title for your XML Document. The <em>File</em> field lets you
      upload an XML file from your local computer. You can create the
      XML Document by either clicking the <em>Add</em> or the <em>Add and Edit</em>
      buttons. </p><p>      After you create an XML Document you can change its XML in two
      different ways. You can edit the XML through the web as text,
      and you can manipulate the document's XML elements as Zope
      sub-objects. </p><h4>      Editing XML</h4>
<p>        To edit a document's XML go to the <em>Edit</em> view. Here you can
        change your document's XML as shown in Figure 9-2.</p><p>        <img src="Figures/uc.png" alt="Editing XML">
</p><p>        You can type XML right in your browser. If you make an error
        and enter invalid XML, Zope will complain. After you have
        changed your document's XML click the <em>Change</em> button. Zope
        does not currently validate XML against a DTD or schema. Later
        versions of XML Document will probably allow validation.</p><p>        You can also change the XML of a document by uploading an XML
        file from your local computer. Go to the <em>Upload</em> view. Here
        you can select an XML file to upload. When you upload an XML
        file, you completely replace the contents of your XML
        Document. As always, you can undo this action if you make a
        mistake. </p><h4>      Accessing Elements</h4>
<p>        An interesting feature of XML Documents is that they represent
        their contents as Zope objects. In other words, you can access
        your document's XML not just as text, but as objects. Go to
        the <em>Contents</em> view of an XML Document to see its contents as
        objects as shown in Figure 9-3.</p><p>        <img src="Figures/uc.png" alt="XML Document Contents view">
</p><p>        As you can see your document's XML elements are represented as
        Zope objects. You can cut, copy, paste, and delete them like
        normal Zope objects. This allows you to move XML elements
        around in your document. Note that you cannot move elements
        out of your document, nor can you move other types of objects
        like Folders into your document.</p><p>        You can also rearrange the order of your document's XML
        elements using the <em>Shift Up</em> and <em>Shift Down</em> buttons. Select
        an element and click the <em>Shift Up</em> button. The element moves
        up in the list of elements. Notice the element's id may change
        as a result of shifting it. XML elements are named according
        to their element name and their position. For example the
        second <em>para</em> element has an id of <em>para-2</em>. If you move this
        element before the first <em>para</em> element, its id will change
        to <em>para-1</em>.</p><p>        If you click on an element you will then be taken to the
        element's management screen as shown in Figure 9-4.</p><p>        <img src="Figures/uc.png" alt="XML element management">
</p><p>        As you can see, elements can also have sub-elements. So not
        only can you move top-level elements around, but you can also
        move elements into other elements. For example, create an
        XML Document named <em>family.xml</em> with these contents:<pre>          &lt;family&gt;

            &lt;mother&gt;
              &lt;eyes color=&quot;brown&quot;/&gt;
              &lt;ears size=&quot;small&quot;/&gt;
            &lt;/mother&gt;

            &lt;father&gt;
              &lt;eyes color=&quot;blue&quot;/&gt;
              &lt;ears size=&quot;large&quot;/&gt;
            &lt;/father&gt;        

            &lt;child/&gt;

            &lt;child/&gt;

          &lt;/family&gt;</pre>
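</p><p>        Element ids such as <em>child-2</em> combine the tag name with the element's position among same-named siblings. The following is a minimal sketch of that naming rule, written as standalone Python that uses the standard library's <em>xml.dom.minidom</em> as a stand-in for XML Document. The helper name <em>element_ids</em> is hypothetical, and this is not how Zope computes ids internally.</p>

```python
from xml.dom import minidom

FAMILY_XML = """<family>
  <mother><eyes color="brown"/><ears size="small"/></mother>
  <father><eyes color="blue"/><ears size="large"/></father>
  <child/>
  <child/>
</family>"""


def element_ids(parent):
    """Return ids like 'child-2' for the element children of parent."""
    counts = {}
    ids = []
    for node in parent.childNodes:
        if node.nodeType != node.ELEMENT_NODE:
            continue  # skip the whitespace text nodes between elements
        counts[node.tagName] = counts.get(node.tagName, 0) + 1
        ids.append("%s-%d" % (node.tagName, counts[node.tagName]))
    return ids


doc = minidom.parseString(FAMILY_XML)
top_ids = element_ids(doc)                      # ids at the document level
family_ids = element_ids(doc.documentElement)   # ids inside <family>
print(top_ids, family_ids)
```

<p>        Because the number is derived from position, moving an element within its parent changes the id that this rule assigns to it.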
</p><p>        Now go to the <em>Contents</em> view of the document and navigate to
        the <em>mother-1</em> element. Select the <em>eyes-1</em> element and click
        <em>Copy</em>. Now go back up to the <em>family-1</em> element and navigate
        down into the <em>child-1</em> element. Click the <em>Paste</em> button. Now
        return to the document and go to the <em>Edit</em> view. You should
        see that the <em>child</em> element now has the same <em>eyes</em>
        sub-element as the <em>mother</em> element.</p><p>        You may have noticed that elements have ids corresponding to
        their tag names. For example, the <em>family</em> tag has an id of
        <em>family-1</em>. The number following the tag name indicates the
        number of the element. For example, notice that the first
        <em>child</em> element has an id of <em>child-1</em> while the second
        <em>child</em> tag has an id of <em>child-2</em>. Since you can have more
        than one element with the same tag name, it is necessary to
        use a number in the element id to ensure that each element has
        a unique id. </p><p>        Since XML elements are Zope objects with unique ids, you can
        treat them just like other Zope objects. For example, you can
        visit an XML element at its URL. You can call acquired methods
        on elements. You can catalog elements. You can walk up to
        elements and manage them. In the course of this chapter we'll
        show you how to do all these things with XML elements.</p><h4>      Editing Elements</h4>
<p>        You may have noticed that elements have several views in
        addition to the <em>Contents</em> view. You can edit the XML of an
        element directly by going to the <em>Edit</em> view. You can also
        replace an element by going to the <em>Upload</em> view and uploading
        an XML file.</p><p>        For example, go to the <em>Edit</em> view of the <em>child</em> node from
        the last example. You should see:<pre>          &lt;eyes color=&quot;brown&quot;/&gt;</pre>
</p><p>        This shows you the element you pasted in from another
        element. Change the contents of the element to:<pre>          &lt;eyes color=&quot;brown&quot;/&gt;
          &lt;ears size=&quot;medium&quot;/&gt;</pre>
</p><p>        Click the <em>Change</em> button. Now you can go to the <em>Contents</em>
        view and see that your element now contains a new <em>ears-1</em>
        element. You can also verify your changes by returning to the
        <em>Edit</em> view of the XML Document. You should see that your
        changes to the element are reflected in the contents of the
        document. </p><h4>      Viewing the DOM </h4>
<p>        XML Documents and elements give you a way to get a graphical
        view of their contents. Go to the <em>DOM Hierarchy</em> view to see a
        tree view of your document's XML as shown in Figure 9-5.</p><p>        <img src="Figures/uc.png" alt="XML Document DOM Hierarchy">
</p><p>        You can expand and collapse the tree by clicking on the plus
        and minus signs next to individual nodes. You can also
        completely expand or collapse the tree with the links at the
        top of the screen. To manage an element click on it. In
        addition to viewing the DOM tree from an XML Document, you can
        view a portion of the DOM tree by navigating to an element and
        then going to the <em>DOM Hierarchy</em> view on that element. This
        will show you the branch of the DOM tree from your
        element. The <em>DOM Hierarchy</em> view is mostly useful as a way to
        examine the structure of your XML and quickly navigate to
        different elements.</p><h3>    The DOM API</h3>
<p>      The DOM API is the standard Internet API for querying and
      controlling XML documents. Zope supports <a href="http://www.w3.org/TR/DOM-Level-2">DOM Level
      2</a> including the traversal
      extensions as defined by the <a href="http://www.w3.org">World Wide Web
      Consortium</a>.  The DOM is a fairly
      complex API that defines how you can access and manipulate an
      XML document.  A complete discussion of the DOM is beyond the
      scope of this book. See Appendix A for a summary of the DOM API
      as supported by XML Document.  You can use the DOM API from
      DTML, Python, and Perl to query and change XML Documents.</p><h3>    Displaying XML with DTML</h3>
<p>      Until XSLT Methods are available, DTML Methods are your best
      choice for displaying XML Documents. You can use DTML to display
      XML Documents exactly the same way you use DTML to display other
      Zope objects. For example suppose you want an XML Document that
      describes a number of invoices. Create an XML Document with an id of
      <em>invoices.xml</em> with these contents:<pre>        &lt;invoices&gt;
          &lt;invoice&gt;
            &lt;number&gt;127&lt;/number&gt;
            &lt;company&gt;Acme Feedbags&lt;/company&gt;
            &lt;amount&gt;34.00&lt;/amount&gt;
            &lt;status&gt;Overdue&lt;/status&gt;
          &lt;/invoice&gt;
          &lt;invoice&gt;
            &lt;number&gt;128&lt;/number&gt;
            &lt;company&gt;Vet Tech&lt;/company&gt;
            &lt;amount&gt;55.00&lt;/amount&gt;
            &lt;status&gt;Normal&lt;/status&gt;
          &lt;/invoice&gt;
        &lt;/invoices&gt;</pre>
</p><p>      To display the invoices in HTML create a DTML Method named
      <em>viewInvoices</em> with this DTML code:<pre>        &lt;dtml-var standard_html_header&gt;

        &lt;h2&gt;Invoices&lt;/h2&gt;

        &lt;table&gt;
          &lt;tr&gt;
            &lt;th&gt;Invoice Number&lt;/th&gt;
            &lt;th&gt;Company&lt;/th&gt;
            &lt;th&gt;Amount&lt;/th&gt;
            &lt;th&gt;Status&lt;/th&gt;
          &lt;/tr&gt;
        &lt;dtml-in expr=&quot;documentElement.getElementsByTagName('invoice')&quot;&gt;
          &lt;tr&gt;
            &lt;td&gt;
              &lt;dtml-in expr=&quot;getElementsByTagName('number')&quot;&gt;
                &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;
              &lt;/dtml-in&gt;
            &lt;/td&gt;
            &lt;td&gt;
              &lt;dtml-in expr=&quot;getElementsByTagName('company')&quot;&gt;
                &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;
              &lt;/dtml-in&gt;
            &lt;/td&gt;
            &lt;td&gt;
              &lt;dtml-in expr=&quot;getElementsByTagName('amount')&quot;&gt;
                &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;
              &lt;/dtml-in&gt;
            &lt;/td&gt;
            &lt;td&gt;
              &lt;dtml-in expr=&quot;getElementsByTagName('status')&quot;&gt;
                &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;
              &lt;/dtml-in&gt;
            &lt;/td&gt;
          &lt;/tr&gt;
        &lt;/dtml-in&gt;
        &lt;/table&gt;      

        &lt;dtml-var standard_html_footer&gt;</pre>
</p><p>      You can use this method to display your XML data by going to the
      URL <em>http://localhost:8080/invoices.xml/viewInvoices</em>. The
      resulting web page is shown in Figure 9-6.</p><p>      <img src="Figures/uc.png" alt="Displaying an XML Document with DTML">
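</p><p>      The DOM calls that <em>viewInvoices</em> makes can also be sketched in plain Python. This is an illustrative example only: it runs outside Zope and uses the standard library's <em>xml.dom.minidom</em> in place of XML Document. The helper names <em>text_of</em> and <em>invoice_rows</em> are made up for this sketch.</p>

```python
from xml.dom import minidom

INVOICES_XML = """<invoices>
  <invoice>
    <number>127</number>
    <company>Acme Feedbags</company>
    <amount>34.00</amount>
    <status>Overdue</status>
  </invoice>
  <invoice>
    <number>128</number>
    <company>Vet Tech</company>
    <amount>55.00</amount>
    <status>Normal</status>
  </invoice>
</invoices>"""

FIELDS = ("number", "company", "amount", "status")


def text_of(element, tag):
    """Join the text nodes under every <tag> descendant of element."""
    parts = []
    for node in element.getElementsByTagName(tag):
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                parts.append(child.nodeValue)
    return "".join(parts)


def invoice_rows(xml_text):
    """Return one tuple of field values per invoice, in document order."""
    doc = minidom.parseString(xml_text)
    invoices = doc.documentElement.getElementsByTagName("invoice")
    return [tuple(text_of(inv, field) for field in FIELDS) for inv in invoices]


rows = invoice_rows(INVOICES_XML)
for row in rows:
    print(" | ".join(row))
```

<p>      Each tuple corresponds to one table row in the DTML Method: an outer loop over <em>invoice</em> elements, and an inner lookup of each field's child text nodes.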
</p><p>      This DTML Method is rather complex since it requires so many DOM
      method calls. It loops over all the <em>invoice</em> elements using the
      <em>getElementsByTagName</em> method. For each <em>invoice</em> element it
      finds the contained <em>number</em>, <em>company</em>, <em>amount</em>, and <em>status</em>
      elements and displays their child text nodes.</p><p>      You could also choose to display each invoice on a separate web
      page. Create a DTML Method named <em>viewInvoice</em> with these
      contents:<pre>        &lt;dtml-var standard_html_header&gt;

        &lt;h2&gt;Invoice&lt;/h2&gt;

        &lt;p&gt;
        &lt;dtml-if previousSibling&gt;
        &lt;dtml-with previousSibling&gt;
        &lt;a href=&quot;&amp;dtml-absolute_url;/viewInvoice&quot;&gt;Previous invoice&lt;/a&gt;
        &lt;/dtml-with&gt;
        &lt;/dtml-if&gt;

        &lt;dtml-if nextSibling&gt;
        &lt;dtml-with nextSibling&gt;
        &lt;a href=&quot;&amp;dtml-absolute_url;/viewInvoice&quot;&gt;Next invoice&lt;/a&gt;
        &lt;/dtml-with&gt;
        &lt;/dtml-if&gt;
        &lt;/p&gt;

        &lt;table&gt;
        &lt;tr&gt;
          &lt;th&gt;Number&lt;/th&gt;
          &lt;td&gt;&lt;dtml-in expr=&quot;getElementsByTagName('number')&quot;&gt;
        &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;&lt;/dtml-in&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;th&gt;Company&lt;/th&gt;
          &lt;td&gt;&lt;dtml-in expr=&quot;getElementsByTagName('company')&quot;&gt;
        &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;&lt;/dtml-in&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;th&gt;Amount&lt;/th&gt;
          &lt;td&gt;&lt;dtml-in expr=&quot;getElementsByTagName('amount')&quot;&gt;
        &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;&lt;/dtml-in&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;th&gt;Status&lt;/th&gt;
          &lt;td&gt;&lt;dtml-in expr=&quot;getElementsByTagName('status')&quot;&gt;
        &lt;dtml-in childNodes&gt;&lt;dtml-var nodeValue&gt;&lt;/dtml-in&gt;&lt;/dtml-in&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;/table&gt; 

        &lt;dtml-var standard_html_footer&gt;         </pre>
</p><p>      Call this method on the first invoice node by going to this URL
      <em>http://localhost:8080/invoices.xml/invoices-1/invoice-1/viewInvoice</em>. You
      should see a web page as shown in Figure 9-7.</p><p>      <img src="Figures/uc.png" alt="Displaying an XML Element with DTML">
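</p><p>      The sibling checks that <em>viewInvoice</em> performs can be sketched in standalone Python. This is a rough illustration under stated assumptions: it uses <em>xml.dom.minidom</em> rather than XML Document, it skips text-node siblings explicitly, and the helpers <em>element_sibling</em>, <em>element_id</em>, and <em>nav_links</em> are hypothetical names. The URLs follow the id pattern used in this chapter.</p>

```python
from xml.dom import minidom


def element_sibling(node, direction):
    """Return the nearest sibling element, skipping text nodes.

    direction is 'previousSibling' or 'nextSibling'.
    """
    sibling = getattr(node, direction)
    while sibling is not None and sibling.nodeType != sibling.ELEMENT_NODE:
        sibling = getattr(sibling, direction)
    return sibling


def element_id(node):
    """Build an id such as 'invoice-2' from tag name and position."""
    position = 1
    sibling = element_sibling(node, "previousSibling")
    while sibling is not None:
        if sibling.tagName == node.tagName:
            position += 1
        sibling = element_sibling(sibling, "previousSibling")
    return "%s-%d" % (node.tagName, position)


def nav_links(node, base_url):
    """Return (label, url) pairs for the neighbouring elements."""
    links = []
    for label, direction in (("Previous invoice", "previousSibling"),
                             ("Next invoice", "nextSibling")):
        sibling = element_sibling(node, direction)
        if sibling is not None:
            url = "%s/%s/viewInvoice" % (base_url, element_id(sibling))
            links.append((label, url))
    return links


doc = minidom.parseString(
    "<invoices><invoice/> <invoice/> <invoice/></invoices>")
first, second, third = doc.getElementsByTagName("invoice")
base = "http://localhost:8080/invoices.xml/invoices-1"
print(nav_links(second, base))
```

<p>      The first invoice gets only a "Next" link and the last only a "Previous" link, which mirrors the two <em>dtml-if</em> tests in the DTML Method.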
</p><p>      An interesting thing to notice is how this display method
      creates links between invoice elements. It checks the
      <em>previousSibling</em> and <em>nextSibling</em> DOM attributes. If they are
      present, it uses the <em>absolute_url</em> method to create a link to
      the elements.</p><p>      It's a fairly common pattern when working with a large XML
      Document to create templates for elements rather than for the
      complete document. Each element template can include
      navigational links that allow you to move between
      elements.</p><h3>    Using XML with Python and Perl</h3>
<p>      You can use Python and Perl to query and change XML
      Documents. You can call DOM methods and access DOM attributes on
      individual elements of an XML Document. For example, create an
      XML Document with an id of <em>addressbook.xml</em> and these contents:<pre>        &lt;addressbook&gt;
          &lt;item&gt;
            &lt;name&gt;Bob&lt;/name&gt;
            &lt;address&gt;2118 Wildhare st.&lt;/address&gt;
          &lt;/item&gt;
          &lt;item&gt;
            &lt;name&gt;Randolf&lt;/name&gt;
            &lt;address&gt;13 E. Roundway&lt;/address&gt;
          &lt;/item&gt;
        &lt;/addressbook&gt;</pre>
</p><p>      You can query this XML Document in Python in a number of
      ways. Here is a Script that does some testing
      on the XML Document:<pre>        ## Script (Python) &quot;query&quot;
        ##
        import string
        # get the XML Document, must use getattr since
        # the id is not a legal Python identifier
        doc=getattr(context, 'addressbook.xml')

        # get the document element
        book=doc.documentElement

        # count items, assuming all children are items
        print &quot;Number of items&quot;, len(book.childNodes)

        # get names of items
        names=[]
        for item in book.childNodes:
            # assumes first child is name
            name=item.firstChild 
            # assumes name has one child which is a text node  
            names.append(name.firstChild.nodeValue)
        print &quot;Names &quot;, string.join(names, &quot;,&quot;)
        return printed</pre>
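</p><p>      The script above relies on several assumptions: every child of the document element is an <em>item</em>, and the first child of each item is its <em>name</em>. The following sketch shows one way to avoid those assumptions by filtering on node type and tag name. It is standalone Python using <em>xml.dom.minidom</em>, not a Zope Script, and the helper names are hypothetical.</p>

```python
from xml.dom import minidom

ADDRESSBOOK_XML = """<addressbook>
  <item>
    <name>Bob</name>
    <address>2118 Wildhare st.</address>
  </item>
  <item>
    <name>Randolf</name>
    <address>13 E. Roundway</address>
  </item>
</addressbook>"""


def child_elements(node, tag):
    """Direct child elements of node with the given tag name."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE and child.tagName == tag]


def node_text(element):
    """Concatenated text of the element's direct text-node children."""
    return "".join(child.nodeValue for child in element.childNodes
                   if child.nodeType == child.TEXT_NODE)


def item_names(xml_text):
    """Names of all items, regardless of whitespace or child order."""
    book = minidom.parseString(xml_text).documentElement
    names = []
    for item in child_elements(book, "item"):
        for name in child_elements(item, "name"):
            names.append(node_text(name))
    return names


names = item_names(ADDRESSBOOK_XML)
print("Number of items", len(names))
print("Names", ",".join(names))
```

<p>      Filtering this way matters when white space between elements is kept as text nodes, since <em>firstChild</em> would then return a text node rather than the <em>name</em> element.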
</p><p>      Querying an XML Document with DOM may be a bit tedious, but it's
      relatively straightforward. You can also write scripts in
      Python and Perl that can query elements of XML Documents. For
      example, here's a Script that expects to be
      called on an <em>item</em> element. It returns the content of the
      <em>name</em> sub-element:<pre>        ## Script (Python) &quot;itemName&quot;
        ##
        # context is assumed to be an item element
        return context.firstChild.firstChild.nodeValue</pre>
</p><p>      You can call this method on the first item in the
      <em>addressbook.xml</em> document by going to this URL
      <em>http://localhost:8080/addressbook.xml/addressbook-1/item-1/itemName</em>.
      You could also call this method on an element from another DTML
      method or a Script. For example, in an earlier section you saw how
      you can call a DTML Method directly on an element to display
      it. The DTML Method could call Scripts on the element in
      order to query the element in ways that would be difficult to do
      from DTML.</p><p>      In addition to querying, Scripts excel at
      modifying XML. You can call DOM methods to edit elements and
      move them around. For example, here's a Script to add a
      new <em>item</em> element to the <em>addressbook.xml</em> XML Document:<pre>        ## Script (Python) &quot;addItem&quot;
        ##parameters=name, address
        ##bind context=doc
        ##
        # call this script on an XML Document

        # create item element and its sub-elements
        item=doc.createElement('item')

        elname=doc.createElement('name')
        elname.appendChild(doc.createTextNode(name))
        item.appendChild(elname)

        eladdr=doc.createElement('address')
        eladdr.appendChild(doc.createTextNode(address))
        item.appendChild(eladdr)

        # add complete item to addressbook
        book=doc.documentElement
        book.appendChild(item)    </pre>
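</p><p>      To experiment with the same DOM calls outside Zope, a sketch along the following lines may help. It assumes the standard library's <em>xml.dom.minidom</em> as a substitute for XML Document, and the function name <em>add_item</em> is chosen only for this example.</p>

```python
from xml.dom import minidom


def add_item(doc, name, address):
    """Append <item><name/><address/></item> to the document element."""
    item = doc.createElement("item")
    for tag, value in (("name", name), ("address", address)):
        element = doc.createElement(tag)
        element.appendChild(doc.createTextNode(value))
        item.appendChild(element)
    doc.documentElement.appendChild(item)
    return item


doc = minidom.parseString("<addressbook/>")
add_item(doc, "Bob", "2118 Wildhare st.")
result = doc.documentElement.toxml()
print(result)
```

<p>      Serializing the document element afterwards is a quick way to confirm that the new <em>item</em> landed where you expected.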
</p><p>      This script creates a new <em>item</em> element along with its
      sub-elements, <em>name</em> and <em>address</em>. It then inserts the <em>item</em>
      element into the <em>addressbook</em> element.</p><p>      Here's another example using two Scripts to
      rearrange the <em>item</em> elements in alphabetical order:<pre>        ## Script (Python) &quot;compareItems&quot;
        ##parameters=x, y
        ##
        &quot;&quot;&quot;
        Compares two item elements alphabetically. Returns -1,
        0, or 1 to indicate less, equal, and greater.

        Used by the sortItems script to sort a list of address
        elements.
        &quot;&quot;&quot; 
        return cmp(x.firstChild.firstChild.nodeValue,
                   y.firstChild.firstChild.nodeValue)

        ## Script (Python) &quot;sortItems&quot;
        ##
        &quot;&quot;&quot;
        Sorts the address elements of an XML Document

        Call this method on an XML Document
        &quot;&quot;&quot;
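        # remove item elements; loop over a copy of the node list,
        # since removing children changes childNodes during the loop
        items=[]
        book=context.documentElement
        for item in list(book.childNodes):
            book.removeChild(item)
            items.append(item)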

        # sort item elements
        items.sort(context.compareItems)            

        # insert them back
        for item in items:
            book.appendChild(item)
</pre>
</p><p>      This script works by removing the <em>item</em> elements one by one
      from the <em>addressbook</em> element. It builds a Python list of
      items, and when it finishes removing and sorting the items,
      it adds them back to the <em>addressbook</em> in sorted order.</p><p>      Rather than adding items to your address book and then sorting
      the entire thing, it would be more efficient to add items in
      the correct order in the first place. That way your address
      book is always in order. Here's a revision of the <em>addItem</em>
      script that adds items in the correct place so that the
      address book stays sorted:<pre>
        ## Script (Python) &quot;addItem&quot;
        ##parameters=name, address
        ##bind context=doc
        # call this method on an XML Document

        # create item element and its sub-elements
        item=doc.createElementNS('', 'item')

        elname=doc.createElementNS('', 'name')
        elname.appendChild(doc.createTextNode(name))
        item.appendChild(elname)

        eladdr=doc.createElementNS('', 'address')
        eladdr.appendChild(doc.createTextNode(address))
        item.appendChild(eladdr)

        # figure out where to insert item using bisect algorithm
        book=doc.documentElement
        items=book.childNodes
        lo=0
        hi=len(items)
        while lo &lt; hi:
            mid = (lo + hi) / 2
            if name &lt; items[mid].firstChild.firstChild.nodeValue:
                hi = mid
            else:
                lo = mid + 1

        # insert item
        if lo == len(items):
            book.appendChild(item)
        else:
            before=items[lo]
            book.insertBefore(item, before)</pre>
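</p><p>      The hand-written search loop above behaves like <em>bisect_right</em> from Python's standard <em>bisect</em> module. The following standalone sketch shows the same insertion strategy using that module and <em>xml.dom.minidom</em>. It is not a Zope Script, the function names are hypothetical, and the third address book entry is invented test data.</p>

```python
import bisect
from xml.dom import minidom


def item_name(item):
    # assumes the first child is <name> holding a single text node
    return item.firstChild.firstChild.nodeValue


def add_item_sorted(doc, name, address):
    """Insert a new item so that items stay ordered by name."""
    item = doc.createElement("item")
    for tag, value in (("name", name), ("address", address)):
        element = doc.createElement(tag)
        element.appendChild(doc.createTextNode(value))
        item.appendChild(element)

    book = doc.documentElement
    items = list(book.childNodes)
    existing = [item_name(node) for node in items]
    # index of the first existing name that sorts after the new name
    index = bisect.bisect_right(existing, name)
    if index == len(items):
        book.appendChild(item)
    else:
        book.insertBefore(item, items[index])
    return index


doc = minidom.parseString("<addressbook/>")
for person, street in (("Randolf", "13 E. Roundway"),
                       ("Bob", "2118 Wildhare st."),
                       ("Mia", "1 Example Lane")):  # invented entry
    add_item_sorted(doc, person, street)
order = [item_name(node) for node in doc.documentElement.childNodes]
print(order)
```

<p>      This sketch rebuilds the list of names on every insert, which keeps it short. For a very large address book you would probably search the nodes directly, as the Script above does.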
</p><p>      Don't worry if you don't understand how this script sorts
      items. The important point is to see that you can use Python's
      expressive logic to work on XML data.</p><p>      In addition to using Python and Perl to manipulate XML using
      DOM, you can use Python and Perl to use Zope services with XML
      data.  For example, instead of maintaining your address book as
      an XML Document, you could maintain it with Zope objects. You
      could then restrict your use of XML to importing and exporting
      data from your Zope address book. While this design may seem
      more difficult, it often proves better in practice. XML elements
      can be treated as Zope objects but often there is a mismatch
      between XML elements and application objects. For example in the
      address book example, the <em>name</em> element is a sub-object of an
      <em>item</em> element. However in your application you may decide that
      it's better to have <em>person</em> objects that contain one or more
      <em>address</em> objects. You may not be able to change your XML format
      to fit your application design since other services may expect
      XML in this format. Later in the chapter you'll see an example
      of how to import and export XML from an application without
      having to store your data as XML. For serious applications this
      approach is often better.</p><h3>    Cataloging XML Documents and Elements</h3>
<p>      One especially interesting service that Zope can provide to XML
      data is searching. Using ZCatalog you can catalog XML Documents
      and their elements. See Chapter 9 for more information on
      ZCatalog. With ZCatalog you can perform full-text searching of
      XML elements. This gives you a lot of control over XML data.</p><p>      For example suppose you have an archive of
      <a href="http://www.docbook.org/">docbook</a> XML articles. They consist of
      an <em>article</em> with a number of <em>section</em> elements. Each <em>section</em>
      element contains a <em>title</em> element, some <em>para</em> elements, and
      optionally additional <em>section</em> elements. Here's an example
      document:<pre>        &lt;article&gt;
          &lt;title&gt;The History of Haircutting&lt;/title&gt;
          &lt;section&gt;
            &lt;title&gt;Prehistory of Barbering&lt;/title&gt;
            &lt;para&gt;Before scissors and razors were invented people cut
            hair with sharp rocks. If rocks were not available a
            barber's own teeth were his next best option.&lt;/para&gt;
          &lt;/section&gt;
          &lt;section&gt;
            &lt;title&gt;Modern Haircutting&lt;/title&gt;
            &lt;para&gt;In these enlightened times hair is most often cut
            with whirling-bladed devices attached to vacuum
            cleaners.&lt;/para&gt;
            &lt;section&gt;
              &lt;title&gt;Modern Hairstyles&lt;/title&gt;
              &lt;para&gt;Modern hairstyles favor form over function, and
              present a true challenge to today's hairstylists.&lt;/para&gt;
            &lt;/section&gt;
          &lt;/section&gt;
        &lt;/article&gt;</pre>
</p><p>      To catalog articles like this you need to create scripts that
      return the text you'd like to index. For this example, let's
      index <em>section</em> elements. This will allow you to search for text
      and find all the sections in all the articles that include the
      search terms. Probably you'll want two scripts: one to return
      the full text of a <em>section</em> element, and another to return the
      text of the section's title. You'll use the full-text script to
      index <em>section</em> objects, and the title script to get meta-data
      on the section. This will allow you to return the titles of all
      the sections that match a given query.</p><p>      Create a Script named <em>section_title</em> to
      return the text of a section's title:<pre>         ## Script (Python) &quot;section_title&quot;
         &quot;&quot;&quot;
         Text of a section element's title element.
         &quot;&quot;&quot;
         for child in context.childNodes:
             if child.tagName == 'title':
                 return child.firstChild.nodeValue</pre>
</p><p>      Now create a Script named <em>section_fulltext</em> to
      return the full text of the <em>section</em> element:<pre>         ## Script (Python) &quot;section_fulltext&quot;
         &quot;&quot;&quot;
         Full text of a section element. Does not include text of
         contained section elements.
         &quot;&quot;&quot;
         text=&quot;&quot;
         for child in context.childNodes:
             if child.tagName in ('para', 'title'):
                 text = text + &quot; &quot; + child.firstChild.nodeValue
         return text</pre>
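</p><p>      To try the two extraction rules without a Zope instance, a sketch like the following may be useful. It assumes <em>xml.dom.minidom</em> in place of XML Document, uses an abbreviated version of the article above, and filters out white space text nodes explicitly. The helper names are made up for this example.</p>

```python
from xml.dom import minidom

ARTICLE_XML = """<article>
  <title>The History of Haircutting</title>
  <section>
    <title>Modern Haircutting</title>
    <para>In these enlightened times hair is most often cut
    with whirling-bladed devices attached to vacuum
    cleaners.</para>
    <section>
      <title>Modern Hairstyles</title>
      <para>Modern hairstyles favor form over function.</para>
    </section>
  </section>
</article>"""


def child_elements(node, tags):
    """Direct child elements of node whose tag name is in tags."""
    return [c for c in node.childNodes
            if c.nodeType == c.ELEMENT_NODE and c.tagName in tags]


def node_text(element):
    """Text of the element's direct text nodes, white space collapsed."""
    text = "".join(c.nodeValue for c in element.childNodes
                   if c.nodeType == c.TEXT_NODE)
    return " ".join(text.split())


def section_title(section):
    titles = child_elements(section, ("title",))
    return node_text(titles[0]) if titles else ""


def section_fulltext(section):
    """Title and para text only; nested sections are left out."""
    return " ".join(node_text(c)
                    for c in child_elements(section, ("title", "para")))


doc = minidom.parseString(ARTICLE_XML)
sections = doc.getElementsByTagName("section")
index = {section_title(s): section_fulltext(s) for s in sections}
for title, text in index.items():
    print(title, "->", text)
```

<p>      The resulting dictionary maps each section title to the text that would be indexed for that section, which is roughly the pairing of meta-data and full text that the catalog stores.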
</p><p>      This script returns text of all <em>para</em> and <em>title</em> sub-elements
      of a <em>section</em> element.</p><p>      Now that you've created both the necessary scripts, it's time to
      create a ZCatalog and catalog your articles. Create a ZCatalog
      at the same level as, or above, the location of your articles. Name
      the catalog <em>articleCatalog</em>. Now go to the <em>Indexes</em> view and
      delete all the existing indexes. Next create a new TextIndex
      named <em>section_fulltext</em>. This tells the catalog to call the
      <em>section_fulltext</em> script on all cataloged objects and to treat
      the result as full text. Now go to the <em>MetaData Table</em> view and
      delete all the existing meta-data columns. Create a new column
      named <em>section_title</em>. This tells the catalog to call the
      <em>section_title</em> script on all cataloged objects and to store the
      result as meta-data which will be available on result
      objects. Now that you've set the indexes and meta-data it's time
      to find and catalog all the <em>section</em> elements in your
      articles. Go to the <em>Find Items to ZCatalog</em> view. In the <em>expr</em>
      field enter <em>tagName=='section'</em> and click the <em>Find</em>
      button. This tells the catalog to search for objects whose
      <em>tagName</em> attribute is equal to the string <em>section</em>. You should
      be taken to the <em>Cataloged Objects</em> view and you should see that
      the catalog now contains a list of <em>section</em> elements.</p><p>      Now you can create a search and results form for the
      catalog. Create a DTML Method inside the catalog named <em>Search</em>
      with these contents:<pre>        &lt;dtml-var standard_html_header&gt;

        &lt;h2&gt;Search Articles&lt;/h2&gt;

        &lt;form action=&quot;Results&quot;&gt;
        Search terms &lt;input type=&quot;text&quot; name=&quot;section_fulltext&quot;&gt;
        &lt;input type=&quot;submit&quot; value=&quot;Search&quot;&gt;
        &lt;/form&gt;

        &lt;dtml-var standard_html_footer&gt;</pre>
</p><p>      Next you need to create a results form that will be called by
      the search form. Create another DTML Method inside the catalog
      named <em>Results</em> with these contents:<pre>        &lt;dtml-var standard_html_header&gt;

        &lt;h2&gt;Found Articles&lt;/h2&gt;

        &lt;dtml-in searchResults&gt;
        &lt;a href=&quot;&lt;dtml-var expr=&quot;getpath(data_record_id_)&quot; url_quote&gt;&quot;&gt;&lt;dtml-var section_title&gt;&lt;/a&gt;&lt;br&gt;
        &lt;/dtml-in&gt;

        &lt;dtml-var standard_html_footer&gt;</pre>
</p><p>      Congratulations, you've implemented a fine-grained full text
      XML search. View the <em>Search</em> method to test it out. Sure enough
      it will find matching <em>section</em> elements.</p><p>      Unfortunately you don't have any way to display the matching
      <em>section</em> elements yet. To complete the example you should
      create a DTML Method to display a <em>section</em> element. Better yet
      you might want to create a method to display an article that
      includes internal anchors so that you could call it with a
      section identifier to display the article beginning with a given
      section. While XSLT is probably the right choice for such a
      method, it could be done in DTML. The implementation of the
      article and section display methods is left as an exercise for
      the ambitious reader.</p><h3>    Controlling XML Parsing</h3>
<p>      XML Document gives you a fair amount of low-level control over
      how it parses XML. To tell XML Documents how to parse XML you
      can create <em>XML Parsing Option</em> objects. Right now you can
      control how Zope handles white space and storage of elements. In
      the future you'll be able to control XML validation. XML Parsing
      Options are only needed by advanced users. You can safely ignore
      them until you find that you need more control over how Zope
      parses your XML.</p><p>      Create an XML Parsing Option named <em>myOption</em>. You should see an
      add screen as shown in Figure 9-8.</p><p>      <img src="Figures/uc.png" alt="XML Parsing Option add screen"></p><p>      The <em>ignoreWSTextNodes</em> option tells Zope to ignore white space
      between elements when parsing XML. You should ignore white space
      unless you have a specific reason not to. The second option
      lets you tell Zope which XML elements you want to store
      separately in the Zope database. In the <em>PersistElementTags</em>
      field you should enter the namespace and tag name for each type
      of element that you want to store separately. The reason to
      store elements separately is to control performance and memory
      use. Increasing the number of persistent elements increases the
      performance hit but lowers the memory usage. The more
      persistent elements you have, the more the database needs to work
      to retrieve them. However, by breaking your XML data into a
      number of persistent elements you lessen the amount of data that
      needs to be in memory because Zope can move unneeded persistent
      elements out of memory. These issues only really come into play
      when you are using very large XML Documents. In these cases
      you'll need to experiment with different settings to find the
      right trade-off between speed and memory use. For now, leave the
      <em>PersistElementTags</em> field blank and click the <em>Add</em> button.</p><p>      Now whenever you create an XML Document you'll be able to
      specify your XML Parsing Options. For example, create a new XML
      Document. You should now see the XML Parsing Option you just
      created as an option available to you on the XML Document add
      form. </p><h2>  Generating XML</h2>
<p>    All Zope objects can create XML. In fact, there is no need to use
    XML Document to create XML. It's fairly easy to create XML with
    DTML. For example suppose you have a folder that contains a number
    of documents describing food. You could represent this data with
    XML like so:<pre>      &lt;documents&gt;
        &lt;document&gt;
          &lt;title&gt;Quiche&lt;/title&gt;
        &lt;/document&gt;
        &lt;document&gt;
          &lt;title&gt;Spaghetti&lt;/title&gt;
        &lt;/document&gt;
        &lt;document&gt;
          &lt;title&gt;Turnips&lt;/title&gt;
        &lt;/document&gt;
      &lt;/documents&gt;</pre>
</p><p>    This XML format may not be that complex, but it's easy to
    generate. Create a DTML Method named "documents.xml" with the
    following contents:<pre>      &lt;documents&gt;
        &lt;dtml-in expr=&quot;objectValues('DTML Document')&quot;&gt;
        &lt;document&gt;
          &lt;title&gt;&lt;dtml-var title&gt;&lt;/title&gt;
        &lt;/document&gt;
        &lt;/dtml-in&gt;
      &lt;/documents&gt;</pre>
</p><p>    As you can see, DTML is as adept at creating XML as it is at
    creating HTML. Simply embed DTML tags among XML tags and you're
    set. The only tricky thing that you may wish to do is to set the
    content-type of the response to <em>text/xml</em> which can be done with
    this DTML code:<pre>      &lt;dtml-call expr=&quot;RESPONSE.setHeader('content-type', 'text/xml')&quot;&gt;</pre>
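    The same structure can also be produced from plain Python, for instance in a script that feeds another system. The sketch below is only an illustration and not part of Zope: the function name <em>documents_xml</em> and the list of titles are assumptions. It uses the standard library's <em>xml.sax.saxutils.escape</em> so that titles containing characters such as &amp; or &lt; still yield well-formed XML:

```python
from xml.sax.saxutils import escape


def documents_xml(titles):
    """Return a <documents> XML string with one <document> per title.

    `titles` is a hypothetical list of plain strings standing in for
    the titles of the DTML Documents in a folder.
    """
    lines = ["<documents>"]
    for title in titles:
        lines.append("  <document>")
        # escape() replaces &, < and > so the output stays well-formed
        lines.append("    <title>%s</title>" % escape(title))
        lines.append("  </document>")
    lines.append("</documents>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(documents_xml(["Quiche", "Spaghetti", "Fish & Chips"]))
```

    In DTML, the comparable safeguard is to quote the inserted value, since a raw title is written out exactly as stored.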
</p><p>    The whole point of generating XML is producing data in a format
    that can be understood by other systems. Therefore you will
    probably want to create XML in an existing format understood by
    the systems you want to communicate with. Suppose you have a
    collection of news items that you want to share with a news
    service using the RSS (Rich Site Summary) XML format. RSS is a
    format developed by Netscape for its <em>my.netscape.com</em> site, which
    has since gained popularity among other news sites. Here's what an
    example RSS file looks like:<pre>      &lt;?xml version=&quot;1.0&quot;?&gt;

      &lt;!DOCTYPE rss PUBLIC &quot;-//Netscape Communications//DTD RSS 0.91//EN&quot;
                   &quot;http://my.netscape.com/publish/formats/rss-0.91.dtd&quot;&gt;

      &lt;rss version=&quot;0.91&quot;&gt;
        &lt;channel&gt;
          &lt;title&gt;Zope.org&lt;/title&gt;
          &lt;link&gt;http://www.zope.org/&lt;/link&gt;
          &lt;description&gt;News from Zope.org&lt;/description&gt;
          &lt;language&gt;en-us&lt;/language&gt;

          &lt;image&gt;
            &lt;title&gt;Zope.org&lt;/title&gt;
            &lt;url&gt;http://www.zope.org/Images/zbutton&lt;/url&gt;
            &lt;link&gt;http://www.zope.org/&lt;/link&gt;
            &lt;width&gt;78&lt;/width&gt;
            &lt;height&gt;77&lt;/height&gt;
            &lt;description&gt;Zope.org&lt;/description&gt;
          &lt;/image&gt;

          &lt;item&gt;
            &lt;title&gt;Zope hotfix: ZPublisher security update&lt;/title&gt;
            &lt;link&gt;http://www.zope.org/Products/Zope/Hotfix_2000-10-02/security_alert&lt;/link&gt;
          &lt;/item&gt;

          &lt;item&gt;
            &lt;title&gt;First development release of HiperDom&lt;/title&gt;
            &lt;link&gt;http://www.zope.org/Members/lalo/HiperDom/Announce_0.1&lt;/link&gt;
          &lt;/item&gt;

          &lt;item&gt;
            &lt;title&gt;Decode barcodes using DTML&lt;/title&gt;
            &lt;link&gt;http://www.zope.org/Members/stevea/barcode_to_amazon/barcode_to_amazon_news&lt;/link&gt;
          &lt;/item&gt;
        &lt;/channel&gt;
      &lt;/rss&gt;</pre>
</p><p>    This is an actual RSS file created using DTML on the
    <em>www.zope.org</em>
    web site. It is built from a catalog query of news items. The main
    features of an RSS file are the <em>channel</em> and <em>item</em>
    elements. The <em>channel</em> element contains information about the
    news source and also contains the <em>item</em> elements. Each <em>item</em>
    element contains information about one news item. In this example the
    <em>item</em> elements come from the results of a catalog search. Here's
    how this RSS is built:<pre>      &lt;dtml-call &quot;RESPONSE.setHeader('content-type', 'text/xml')&quot;&gt;&lt;?xml version=&quot;1.0&quot;?&gt;

      &lt;!DOCTYPE rss PUBLIC &quot;-//Netscape Communications//DTD RSS 0.91//EN&quot;
                   &quot;http://my.netscape.com/publish/formats/rss-0.91.dtd&quot;&gt;

      &lt;rss version=&quot;0.91&quot;&gt;
        &lt;channel&gt;   

          &lt;title&gt;Zope.org&lt;/title&gt;
          &lt;link&gt;http://www.zope.org/&lt;/link&gt;
          &lt;description&gt;News from Zope.org&lt;/description&gt;
          &lt;language&gt;en-us&lt;/language&gt;

          &lt;image&gt;
            &lt;title&gt;Zope.org&lt;/title&gt;
            &lt;url&gt;http://www.zope.org/Images/zbutton&lt;/url&gt;
            &lt;link&gt;http://www.zope.org/&lt;/link&gt;
            &lt;width&gt;78&lt;/width&gt;
            &lt;height&gt;77&lt;/height&gt;
            &lt;description&gt;Zope.org&lt;/description&gt;
          &lt;/image&gt;

          &lt;dtml-in expr=&quot;searchResults(
            meta_type='News Item',
            sort_on='date',
            sort_order='reverse')&quot; size=&quot;3&quot;&gt;
          &lt;item&gt;
           &lt;title&gt;&lt;dtml-var title&gt;&lt;/title&gt;
           &lt;link&gt;&lt;dtml-var BASE0&gt;/&lt;dtml-var url&gt;&lt;/link&gt;
          &lt;/item&gt;
          &lt;/dtml-in&gt;

        &lt;/channel&gt;
      &lt;/rss&gt;</pre>
</p><p>    This method is mostly static XML; only the <em>item</em> elements are
    dynamically generated. You could support RSS more flexibly by
    using objects and properties to keep track of channel
    information. You could also gather information for <em>item</em> elements
    in many ways besides searching a catalog. For example you could
    directly iterate over Zope objects, or you could use Python
    Scripts to retrieve information from the network or the
    filesystem.</p><p>    If using DTML to create XML is too easy for your taste, you can use
    XML Document instead to build XML programmatically using the
    DOM. There may be cases where this is necessary, but most often
    DTML will work fine. It's important to remember that XML is a
    format for communication. Use it pragmatically to enable your web
    applications to communicate. Often there is little need to store
    XML data internally in your application. Usually you can generate
    XML from Zope objects when you need to send it and parse XML into
    Zope objects when you need to read it.</p><h2>  Processing XML</h2>
<p>    A common use of XML is to communicate information. For the
    receiver to understand the communication, it needs to decode the
    XML message. Zope already understands some kinds of XML
    messages such as XML-RPC and WebDAV. As you create web
    applications that communicate with other systems you may want to
    have the ability to receive XML messages. You can receive XML in a
    number of ways: you can read XML files from the file system or over
    the network, or you can define methods that take XML arguments
    which can be called by remote systems.</p><p>    Once you have received an XML message you must process the XML to
    find out what it means and how to act on it. You have two basic
    choices within Zope for processing XML. You can create an XML
    Document and use the DOM API to examine the XML. Alternately you
    can manually parse the XML using Python or Perl's XML parsing
    facilities. Using XML Document and DOM requires less programming
    for simple cases, but can be unwieldy and inefficient, especially
    for large amounts of XML.</p><p>    You've already seen how to process XML using XML Document and DOM,
    so now let's take a quick look at how you might parse XML manually
    using Python. Suppose you want to connect your web application to a
    <a href="http://www.jabber.com/">Jabber</a> chat server. You might want to
    allow users to message you and receive dynamic responses based on
    the status of your web application. For example suppose you want
    to allow users to check the status of their items using instant
    messaging. Your application should respond to XML instant messages
    like this:<pre>      &lt;message to=&quot;webapp@example.com&quot; from=&quot;user@host.com&quot;&gt;
        &lt;body&gt;status&lt;/body&gt;
      &lt;/message&gt;</pre>
</p><p>    You could scan the body of the message for commands, call methods,
    and return responses like this:<pre>      &lt;message to=&quot;user@host.com&quot; from=&quot;webapp@example.com&quot;&gt;
        &lt;body&gt;All is well as of 3:12pm&lt;/body&gt;
      &lt;/message&gt;</pre>
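    The command scanning step might be sketched in Python as follows. Everything here is an assumption made for illustration: the function name <em>get_response</em>, the command table, and the canned status text are hypothetical stand-ins for calls into your own application:

```python
def status_command():
    """Hypothetical command; a real one would query the application."""
    return "All is well"


# Map of recognized command words to the functions that answer them.
COMMANDS = {
    "status": status_command,
}


def get_response(body):
    """Return a reply string for the text of a message body.

    The first word of the body is treated as the command name and is
    compared case-insensitively.
    """
    words = body.split()
    if not words:
        return "Empty message"
    command = COMMANDS.get(words[0].lower())
    if command is None:
        return "Unknown command: %s" % words[0]
    return command()


if __name__ == "__main__":
    print(get_response("status"))
```

    Keeping the commands in a table means that supporting a new command is a matter of adding one entry.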
</p><p>    Here's a sketch of how you could implement this XML messaging
    facility in your web application using a Python External Method:<pre>      # uses Python's standard XML processing package
      # see http://www.python.org/doc/current/lib/module-xml.sax.html
      # for information about Python's SAX (Simple API for XML) support

      from xml.sax import parseString
      from xml.sax.handler import ContentHandler
      from xml.sax.saxutils import escape, quoteattr

      class MessageHandler(ContentHandler):
          &quot;&quot;&quot;
          SAX message handler class

          Extracts a message's to, from, and body
          &quot;&quot;&quot;

          inbody=0
          body=&quot;&quot;

          def startElement(self, name, attrs):
              if name==&quot;message&quot;:
                  self.to=attrs['to']
                  # &quot;from&quot; is a reserved word in Python, so
                  # the sender is stored under another name
                  self.sender=attrs['from']
              elif name==&quot;body&quot;:
                  self.inbody=1

          def endElement(self, name):
              if name==&quot;body&quot;:
                  self.inbody=0

          def characters(self, content):
              # may be called several times for one run of text
              if self.inbody:
                  self.body=self.body + content

      def receiveMessage(self, message):
          &quot;&quot;&quot;
          Called by a Jabber server
          &quot;&quot;&quot;
          handler=MessageHandler()
          parseString(message, handler)

          # call a method that returns a response string
          # given a message body string
          response_body=self.getResponse(handler.body)

          # create a response XML message, swapping sender and
          # recipient and escaping the values for use in XML
          response_message=&quot;&quot;&quot;
            &lt;message to=%s from=%s&gt;
              &lt;body&gt;%s&lt;/body&gt;
            &lt;/message&gt;&quot;&quot;&quot; % (quoteattr(handler.sender),
                                 quoteattr(handler.to),
                                 escape(response_body))

          # return it to the server
          return response_message</pre>
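    A remote system reaches a method like the one above over HTTP, with the XML sent as a form field whose name matches the method's <em>message</em> argument. The client-side sketch below only builds such a request and does not send it. The function name <em>build_post</em> and the URL are hypothetical, and the calls come from Python 3's standard <em>urllib</em> package:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(url, message_xml):
    """Build (but do not send) an HTTP POST carrying the XML message.

    The XML travels as a form field named "message", matching the
    name of the method's argument.
    """
    data = urlencode({"message": message_xml}).encode("ascii")
    return Request(url, data=data, method="POST")


if __name__ == "__main__":
    # Hypothetical address of the External Method in a Zope site.
    url = "http://localhost:8080/app/receiveMessage"
    xml = ('<message to="webapp@example.com" from="user@host.com">'
           '<body>status</body></message>')
    request = build_post(url, xml)
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would actually send it.
```

    Building the request separately from sending it makes the encoding step easy to inspect before any server is involved.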
</p><p>    This External Method uses Python's SAX (Simple API for
    XML) package to parse the XML message. The <em>MessageHandler</em> class
    receives callbacks as Python parses the message. The handler saves
    information it's interested in. The method uses the handler
    class by creating an instance of it, and passing it to the
    <em>parseString</em> function. It then figures out a response message by
    calling <em>getResponse</em> with the message body. This method
    presumably scans the body for commands, queries the web
    application's state, and returns some response. The <em>receiveMessage</em>
    method then creates an XML message using the response and the sender
    information and returns it.</p><p>    The remote server would use this method by calling the
    <em>receiveMessage</em> method using the standard HTTP POST
    command. Voila, you've implemented a custom XML chat server that
    runs over HTTP.</p><h2>  DOM API for All Zope Objects</h2>
<p>    In addition to the Zope API, Zope objects support a subset of the
    DOM (Document Object Model) API. The <a href="http://www.w3.org/DOM/">DOM</a> is
    an Internet standard for querying and scripting documents.</p><p>    DOM provides an interface to hierarchical data. DOM is designed to
    treat XML and HTML documents as collections of nodes. In the case of
    Zope's DOM support, you can use the DOM to query the Zope object
    hierarchy as a collection of nodes.</p><p>    DOM is a well documented and well understood API. If you've worked with
    DOM before you may find it more familiar and comfortable than Zope's
    API.</p><h3>    DOM Methods and Attributes</h3>
<p>      Zope supports the read-only methods and attributes of the
      level-2 DOM API.  The DOM API represents Zope objects as DOM
      elements and string properties as DOM attributes. There are also
      a few additional bindings. For example, DOM node names
      correspond to Zope object meta-types. Also DOM node IDs
      correspond to Zope object ids.</p><p>      So for example, this is how you could use the DOM API to return
      a list of your sub-objects' node names:<pre>        results=[]
        for child in self.childNodes:
            results.append(child.nodeName)
        return results</pre>
</p><p>      This will return a list of object types like so:<pre>        zope:DTMLMethod
        zope:DTMLDocument
        zope:Folder</pre>
</p><p>      This shows you that the DOM API interprets sub-objects as child
      nodes. It also demonstrates that node names are qualified by a
      namespace with the prefix <em>zope</em>. The URI of this namespace is
      <em>http://namespaces.zope.org/NullNamespace</em>. So using the DOM API
      you can effectively treat your sub-objects like XML nodes.</p><p>      Here's how you can use the DOM to find out the title of your
      first sub-object:<pre>        child=self.firstChild
        return child.getAttributeNS(
            'http://namespaces.zope.org/NullNamespace',
            'title')</pre>
</p><p>      This returns the value of the first child's <em>title</em>
      attribute. Suppose your first child was a DTML Method with a
      title of <em>display</em>. This XML is how the DOM API would understand
      the method:<pre>        &lt;zope:DTMLMethod
        xmlns:zope=&quot;http://namespaces.zope.org/NullNamespace&quot; title=&quot;display&quot;&gt;
        &lt;/zope:DTMLMethod&gt;</pre>
</p><p>      Notice that the contents of the DTML Method are not available
      via the DOM API. Also notice that all spaces in the meta-type
      have been removed. This is because XML doesn't allow spaces in
      element names.</p><p>      Another useful DOM method is <em>getElementsByTagNameNS</em>. This method
      recursively descends the object hierarchy searching for elements
      with a given tag name. The tag name of a Zope object is its
      meta-type (which is also considered its node name). Here is a
      bit of Python that will return all DTML Documents contained by
      the current object and all its sub-objects:<pre>        return self.getElementsByTagNameNS(
            'http://namespaces.zope.org/NullNamespace',
            'DTMLDocument')</pre>
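      These DOM calls can be tried outside Zope with any namespace-aware DOM implementation. The sketch below uses the standard library's <em>xml.dom.minidom</em> on a hand-written XML picture of a folder. The sample document, its titles, and the helper function names are assumptions. The <em>title</em> attributes carry the <em>zope</em> prefix so that a standard parser places them in the namespace; Zope's own mapping of properties may differ:

```python
from xml.dom.minidom import parseString

ZOPE_NS = "http://namespaces.zope.org/NullNamespace"

# Hand-written stand-in for a Zope folder and its sub-objects.
SAMPLE = (
    '<zope:Folder xmlns:zope="%s" zope:title="root">'
    '<zope:DTMLMethod zope:title="display"/>'
    '<zope:DTMLDocument zope:title="index"/>'
    '<zope:Folder zope:title="sub">'
    '<zope:DTMLDocument zope:title="nested"/>'
    '</zope:Folder>'
    '</zope:Folder>' % ZOPE_NS
)


def element_children(node):
    """Return the child nodes that are elements (skips text nodes)."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


def child_names(node):
    """Node names of the direct sub-elements, read via childNodes."""
    return [c.nodeName for c in element_children(node)]


def first_child_title(node):
    """Title attribute of the first sub-element."""
    return element_children(node)[0].getAttributeNS(ZOPE_NS, "title")


def find_all(node, tag_name):
    """Recursive search by tag name; "*" matches every tag."""
    return node.getElementsByTagNameNS(ZOPE_NS, tag_name)


if __name__ == "__main__":
    folder = parseString(SAMPLE).documentElement
    print(child_names(folder))
    print(first_child_title(folder))
    print(len(find_all(folder, "DTMLDocument")))
    print(len(find_all(folder, "*")))
```

      The recursive search reaches the document inside the nested folder as well, whereas <em>childNodes</em> only lists direct children.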
</p><p>      You can use the asterisk as a tag name to indicate that you want
      to match all tag names.  You can find further documentation of
      the Zope object implementation of the DOM API in Appendix B.</p><h3>    Zope API versus DOM API</h3>
<p>      You may have noticed that the DOM API on regular Zope objects
      doesn't buy you a lot. It's kind of nifty to use DOM methods,
      but <em>childNodes</em> isn't really any better than <em>objectValues</em>. In
      fact, many DOM methods are less flexible than normal Zope API
      methods. Using the DOM API on Zope objects has two important
      virtues that make it worthwhile:<ol>
<li>  It is a standard, so it's familiar and documented.</li>
<li>  It allows technologies built on the DOM API to be added to
          Zope.</li>
</ol>
</p><p>      Right now the second virtue has yet to flower fully. Two
      technologies that will be added to Zope on top of the DOM API
      are XPath and XSLT. For now the familiarity of the DOM is the
      most important reason to use it. If you already know DOM, then
      you may find it more comfortable than the normal Zope API for
      querying Zope about sub-objects and properties.</p></body>
</html>