<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>CDuce: Queries</title><meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"/><link type="text/css" rel="stylesheet" href="cduce.css"/></head><body style="margin: 0; padding : 0;"><table border="0" width="100%" cellspacing="10" cellpadding="0"><tr><td valign="top" align="left" style="width:20%;"><div class="leftbar" id="leftbar"><div class="smallbox"><ul><li><a href="#sel">Select from where</a></li><li><a href="#xpath">XPath-like expressions</a></li><li><a href="#exsel">Examples</a></li></ul><p> You can cut and paste the code on this page and test it on the <a href="cgi-bin/cduce">online interpreter</a>. </p></div></div></td><td><h1>Queries</h1><div class="mainpanel"><div class="smallbox"><p><a href="index.html">CDuce: documentation</a>: <a href="tutorial.html">Tutorial</a>: Queries</p><p><a href="tutorial_references.html"><img class="icon" width="16" alt="Previous page:" height="16" src="img/left.gif"/> References</a> <a href="tutorial_higherorder.html"><img class="icon" width="16" alt="Next page:" height="16" src="img/right.gif"/> Higher-order functions</a></p></div><div><h2><a name="sel">Select from where</a></h2><b style="color:#FF0080">WE SHOULD REWRITE THIS SECTION TO GET RID OF THE MANY TYPE DEFINITIONS. JUST USE THE ONE OF USE CASES</b><p> CDuce is endowed with a <b><tt>select_from_where</tt></b> syntax to perform some SQL-like queries. The general form of select expressions is </p><div class="code"><pre> select <i>e</i> from <i>p1</i> in <i>e1</i>, <i>p2</i> in <i>e2</i>, : <i>pn</i> in <i>en</i> where <i>b</i> </pre></div><p> where <b><tt><i>e</i></tt></b> is an expression <b><tt><i>b</i></tt></b> a boolean expression, the <b><tt><i>pi</i></tt></b>'s are patterns, and the <b><tt><i>ei</i></tt></b>'s are sequence expressions. </p><p> The <b><tt>select_from_where</tt></b> construction is translated into: </p><div class="code"><pre> transform <i>e1</i> with <i>p1</i> -> transform <i>e2</i> with <i>p2</i> -> ... transform <i>en</i> with <i>pn</i> -> if <i>b</i> then [<i>e</i>] else [] </pre></div></div><div><h2><a name="xpath">XPath-like expressions</a></h2><p> XPath-like expressions are of two kind : <b><tt> <i>e</i>/<i>t</i> </tt></b>, <b><tt> <i>e</i>/@<i>a</i> </tt></b>, (and <b><tt> <i>e</i>//<i>t</i> </tt></b>) where <b><tt><i>e</i></tt></b> is an expression, <b><tt> <i>t</i> </tt></b> a type, and <b><tt> <i>a</i> </tt></b> an attribute. </p><p> They are syntactic sugar for :</p><div class="code"><pre> flatten(select x from <_ ..>[(x::t | _ )*] in e) </pre></div><p> and </p><div class="code"><pre> select x from <_ a=x ..>_ in e </pre></div></div><div><h2><a name="exsel">Examples</a></h2><h3>Types and data for the examples</h3><p> Let us consider the following DTD and the CDuce types representing a Bibliography </p><div class="code"><pre> <!ELEMENT bib (book* )> <!ELEMENT book (title, (author+ | editor+ ), publisher, price )> <!ATTLIST book year CDATA #REQUIRED > <!ELEMENT author (last, first )> <!ELEMENT editor (last, first, affiliation )> <!ELEMENT title (#PCDATA )> <!ELEMENT last (#PCDATA )> <!ELEMENT first (#PCDATA )> <!ELEMENT affiliation (#PCDATA )> <!ELEMENT publisher (#PCDATA )> <!ELEMENT price (#PCDATA )> type bib = <bib>[book*] type book = <book year=String>[title (author+ | editor+ ) publisher price ] type author = <author>[last first ] type editor = <editor>[last first affiliation ] type title = <title>[PCDATA ] type last = <last>[PCDATA] type first = <first>[PCDATA] type affiliation = <affiliation>[PCDATA] type publisher = <publisher>[PCDATA] type price = <price>[PCDATA] </pre></div><p> and some values </p><div class="code"><pre> let biblio : bib = <bib>[ <book year="1994">[ <title>['TCP/IP Illustrated'] <author>[ <last>['Stevens'] <first>['W.']] <publisher>['Addison-Wesley'] <price>['65.95'] ] <book year="1992">[ <title>['Advanced Programming in the Unix environment'] <author>[ <last>['Stevens'] <first>['W.']] <publisher>['Addison-Wesley'] <price>['65.95'] ] <book year="2000">[ <title>['Data on the Web'] <author>[ <last>['Abiteboul'] <first>['Serge']] <author>[ <last>['Buneman'] <first>['Peter']] <author>[ <last>['Suciu'] <first>['Dan']] <publisher>['Morgan Kaufmann Publishers'] <price>['39.95'] ] <book year="1999">[ <title>['The Economics of Technology and Content for Digital TV'] <editor>[ <last>['Gerbarg'] <first>['Darcy'] <affiliation>['CITI'] ] <publisher>['Kluwer Academic Publishers'] <price>['129.95']] ] </pre></div><h3>Projections</h3><ul><li> All titles in the bibliography biblio <div class="code"><pre> let titles = [biblio]/book/title </pre></div><p> Which yields to:</p><div class="code"><pre> val titles : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ] <title>[ 'Advanced Programming in the Unix environment' ] <title>[ 'Data on the Web' ] <title>[ 'The Economics of Technology and Content for Digital TV' ] ] </pre></div></li><li>All authors in the bibliography biblio <div class="code"><pre> let authors = [biblio]/book/<author>_ </pre></div><p> Yielding the result: </p><div class="code"><pre> val authors : [ <author>[ last first ]* ] = [ <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ] <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ] <author>[ <last>[ 'Abiteboul' ] <first>[ 'Serge' ] ] <author>[ <last>[ 'Buneman' ] <first>[ 'Peter' ] ] <author>[ <last>[ 'Suciu' ] <first>[ 'Dan' ] ] ] </pre></div><p>Note the difference between this two projections.<br/> In the fist one, we use the preset type title (type title = <title>[PCDATA ] ).<br/> In the second one, the type <author>_ means all the xml fragments beginning by the tag author ( _ means Any), and this tag is without attribute. In contrary, we write note <author ..>_. </p></li><li> All books having an editor in the bibliography biblio <div class="code"><pre> let edibooks = [biblio]/<book year=_>[_* editor _*] </pre></div><p> Yielding: </p><div class="code"><pre> val edibooks : [ <book year=String>[ title editor+ publisher price]* ] = [ <book year="1999">[ <title>[ 'The Economics of Technology and Content for Digital TV'] <editor>[ <last>[ 'Gerbarg' ] <first>[ 'Darcy' ] <affiliation>[ 'CITI' ] ] <publisher>[ 'Kluwer Academic Publishers' ] <price>[ '129.95' ] ] ] </pre></div></li><li> All books without authors. <div class="code"><pre> let edibooks2 = [biblio]/<book ..>[(Any\author)*] </pre></div><p> Yielding: </p><div class="code"><pre> val edibooks2 : [ <book year=String>[ title editor+ publisher price]* ] = [ <book year="1999">[ <title>[ 'The Economics of Technology and Content for Digital TV'] <editor>[ <last>[ 'Gerbarg' ] <first>[ 'Darcy' ] <affiliation>[ 'CITI' ] ] <publisher>[ 'Kluwer Academic Publishers' ] <price>[ '129.95' ] ] ] </pre></div></li><li> The year of books having a price of 65.95 in the bibliography biblio <div class="code"><pre> let books = [biblio]/<book ..>[_* <price>['65.95']]/@year </pre></div><p> Yielding: </p><div class="code"><pre> val books : [ String* ] = [ "1994" "1992" ] </pre></div></li><li> All the authors and editors in biblio <div class="code"><pre> let aebooks = [biblio]/book/(author|editor) </pre></div><p> Yielding: </p><div class="code"><pre> val aebooks : [ (editor | author)* ] = [ <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ] <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ] <author>[ <last>[ 'Abiteboul' ] <first>[ 'Serge' ] ] <author>[ <last>[ 'Buneman' ] <first>[ 'Peter' ] ] <author>[ <last>[ 'Suciu' ] <first>[ 'Dan' ] ] <editor>[ <last>[ 'Gerbarg' ] <first>[ 'Darcy' ] <affiliation>[ 'CITI' ] ] ] </pre></div></li></ul><p> An interesting point in Cduce is the static typing of an expression. By example if we consider the third projection, "All books having an editor in the bibliography", CDuce knows the type of the result of the value <b><tt><i>edibooks</i></tt></b>: </p><div class="code"><pre> val edibooks : [ <book year=String>[ title editor+ publisher price]* ] = ... </pre></div><p>This type represents a book without author (see the book type in the type declaration in the top of this section). Now if we want to know all authors of this list of books <b><tt><i>edibooks</i></tt></b>: </p><div class="code"><pre> let authorsofedibooks = edibooks/author </pre></div><p>Yelding:</p><div class="code"><pre> Warning at chars 24-39: This projection always returns the empty sequence val authorsofedibooks : [ ] = "" </pre></div><p> In fact the value <b><tt><i>edibooks</i></tt></b> must be a subtype of [<_ ..>[Any* author Any*] *], and here this is not the case. If you want to be sure, test this: </p><div class="code"><pre> match edibooks with [<_ ..>[_* author _*] *] -> "this is a subtype" | _ -> "this is not a subtype" ;; </pre></div><p> An other projection should be convince you, is the query: </p><div class="code"><pre> let freebooks = [biblio]/book/<price>['0'] ] </pre></div><p>Yelding:</p><div class="code"><pre> val freebooks : [ <price>[ '0' ]* ] = "" </pre></div><p> There is no free books in this bibliography, That is not indicated by the type of biblio. Then, this projection returns the empty sequence (<b><tt>""</tt></b>)</p><h3>Select_from_where</h3><p>The same queries we wrote above can of course be programmed with the <b><tt>select_from_where</tt></b> construction </p><p> All the titles </p><div class="code"><pre> let tquery = select y from x in [biblio]/book , y in [x]/title </pre></div><p> This query is programmed in a XQuery-like style largely relying on the projections. Note that <b><tt>x</tt></b> and <b><tt>y</tt></b> are CDuce's patterns. The result is:</p><div class="code"><pre> val tquery : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ] <title>[ 'Advanced Programming in the Unix environment' ] <title>[ 'Data on the Web' ] <title>[ 'The Economics of Technology and Content for Digital TV' ] ] </pre></div><p> Now let's program the same query with the translation given previously thus eliminating the <b><tt>y</tt></b> variable </p><div class="code"><pre> let withouty = flatten(select [x] from x in [biblio]/book/title) </pre></div><p> Yielding: </p><div class="code"><pre> val tquery : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ] <title>[ 'Advanced Programming in the Unix environment' ] <title>[ 'Data on the Web' ] <title>[ 'The Economics of Technology and Content for Digital TV' ] ] </pre></div><p> But the <b><tt>select_from_where</tt></b> expressions are likely to be used for more complex queries such as the one that selects all titles whose at least one author is "Peter Buneman" or "Dan Suciu" </p><div class="code"><pre> let sel = select y from x in [biblio]/book , y in [x]/title, z in [x]/author where ( (z = <author>[<last>['Buneman']<first>['Peter']]) || (z = <author>[<last>['Suciu'] <first>['Dan']]) ) </pre></div><p> Which yields: </p><div class="code"><pre> val sel : [ title* ] = [ <title>[ 'Data on the Web' ] <title>[ 'Data on the Web' ] ] </pre></div><p>Note that the corresponding semantics, as in SQL, is a multiset one. Thus duplicates are not eliminated. To discard them, one has to use the <b><tt>distinct_values</tt></b> operator. </p><h3>A pure pattern example</h3><p> This example computes the same result as the previous query except that duplicates are eliminated. It is written in a pure pattern form (i.e., without any XPath-like projections) </p><div class="code"><pre> let sel = select t from <_ ..>[(x::book| _ )*] in [biblio], <_ ..>[ t&title _* (<author>[<last>['Buneman']<first>['Peter']] | <author>[<last>['Suciu'] <first>['Dan']]) ; _] in x </pre></div><p> Note the pattern on the second line in the <b><tt> from </tt></b> clause. As the type of an element in <b><tt>x</tt></b> is <b><tt>type book = <book year=String>[title (author+ | editor+ ) publisher price ]</tt></b>, we skip the tag : <b><tt><_ ..></tt></b>, then we then capture the corresponding title <b><tt>(t &title)</tt></b> then we skip authors <b><tt>_* </tt></b> until we find either Peter Buneman or Dan Suciu <b><tt> (<author>[<last>['Buneman']<first>['Peter']]| <author>[<last>['Suciu'] <first>['Dan']])</tt></b>, then we skip the remaining authors and the tail of the sequence by writing <b><tt>; _</tt></b></p><p> Result: </p><div class="code"><pre> val sel : [ title* ] = [ <title>[ 'Data on the Web' ] ] </pre></div><p> This pure pattern form of the query yields (in general) better performance than the same one written in an XQuery-like programming style. However, the query optimiser automatically translates the latter into a pure pattern one </p><h3>Joins</h3><p> This example is the exact transcription of <a href="http://www.w3.org/TR/xquery-use-cases/#xmp-queries-results-q5">query Q5 of XQuery use cases</a>. On top of this section we give the corresponding CDuce types. We give here the type of the document to be joined, and the sample value. </p><div class="code"><pre> type Reviews =<reviews>[Entry*] type Entry = <entry> [ Title Price Review] type Title = <title>[PCDATA] type Price= <price>[PCDATA] type Review =<review>[PCDATA] let bstore2 : Reviews = <reviews>[ <entry>[ <title>['Data on the Web'] <price>['34.95'] <review> ['A very good discussion of semi-structured database systems and XML.'] ] <entry>[ <title>['Advanced Programming in the Unix environment'] <price>['65.95'] <review> ['A clear and detailed discussion of UNIX programming.'] ] <entry>[ <title>['TCP/IP Illustrated'] <price>['65.95'] <review> ['One of the best books on TCP/IP.'] ] ] </pre></div><p> The queries are expressed first in an XQuery-like style, then in a pure pattern style: the first pattern-based query is the one produced by the automatic translation from the first one. The last query correponds to a pattern aware programmer's version. </p><p> XQuery style </p><div class="code"><pre> <books-with-prices> select <book-with-price>[t1 <price-bstore1>([p1]/Char) <price-bstore2>([p2]/Char)] from b in [biblio]/book , t1 in [b]/title, e in [bstore2]/Entry, t2 in [e]/Title, p2 in [e]/Price, p1 in [b]/price where t1=t2 </pre></div><p> Automatic translation of the previous query into a pure pattern (thus more efficient) one </p><div class="code"><pre> <books-with-prices> select <book-with-price>[t1 <price-bstore1>x10 <price-bstore2>x11 ] from <_ ..>[(x3::book|_)*] in [biblio], <_ ..>[(x9::price|x5::title|_)*] in x3, t1 in x5, <_ ..>[(x6::Entry|_)*] in [bstore2], <_ ..>[(x7::Title|x8::Price|_)*] in x6, t2 in x7, <_ ..>[(x10::Char)*] in x9, <_ ..>[(x11::Char)*] in x8 where t1=t2 </pre></div><p> Pattern aware programmer's version of the same query (hence hand optimised). This version of the query is very efficient. Be aware of patterns. </p><div class="code"><pre> <books-with-prices> select <book-with-price>[t2 <price-bstore1>p1 <price-bstore2>p2] from <bib>[b::book*] in [biblio], <book ..>[t1&title _* <price>p1] in b, <reviews>[e::Entry*] in [bstore2], <entry>[t2&Title <price>p2 ;_] in e where t1=t2 </pre></div><h3>More complex Queries: on the power of patterns</h3><div class="code"><pre> let biblio = [biblio]/book;; <bib> select <book (a)> x from <book (a)>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><p> This expression returns all book in bib but removoing the editor element. If one wants to write more explicitly: </p><div class="code"><pre> select <book (a)> x from <book (a)>[ (x::(Any\<strong class="ocaml"><editor ..>_</strong>)|_ )* ] in biblio </pre></div><p>Or even: </p><div class="code"><pre> select <book (a)> x from <book (a)>[ (x::(<(<strong class="ocaml">_</strong>\<strong class="ocaml">`editor</strong>) ..>_)|_ )* ] in biblio </pre></div><p>Back to the first one:</p><div class="code"><pre> <bib> select <book (a)> x from <<strong class="ocaml">(</strong>book<strong class="ocaml">)</strong> (a)>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><p> This query takes any element in bib, tranforms it in a book element and removes sub-elements editor, but you will get a warning as capture variable <b><tt>book</tt></b> in the <b><tt>from</tt></b> is never used: we should have written <b><tt><(_) (a)></tt></b> instead of <b><tt><(book) (a)></tt></b> the from </p><div class="code"><pre> select <<strong class="ocaml">(</strong>book<strong class="ocaml">)</strong> (a)> x from <(book) (a)>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><p> Same thing but without tranforming tag to "book".<br/> More interestingly:</p><div class="code"><pre> select <(b) (a\<strong class="ocaml">id</strong>)> x from <(b) (a)>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><p>removes all "id" attribute (if any) from the attributes of the element in bib. </p><div class="code"><pre> select <(b) (a\id+{bing=a.id})> x from <(b) (a)>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><p>Changes attribute <b><tt>id=x</tt></b> into <b><tt>bing=x</tt></b> However, one must be sure that each element in <b><tt>bib</tt></b> has an <b><tt>id</tt></b> attribute if such is not the case the expression is ill-typed. If one wants to perform this only for those elements which certainly have an <b><tt>id</tt></b> attribute then: </p><div class="code"><pre> select <(b) (a\id+{bing=a.id})> x from <(b) (a&{id=_})>[ (x::(Any\editor)|_ )* ] in biblio </pre></div><h3>An unorthodox query: Formatted table generation</h3><p> The following program generates a 10x10 multiplication table: </p><div class="code"><pre> let bg ((Int , Int) -> String) (y, x) -> if (x mod 2 + y mod 2 <= 0) then "lightgreen" else if (y mod 2 <= 0) then "yellow" else if (x mod 2 <= 0) then "lightblue" else "white";; <table border="1"> select <tr> select <td align="right" style=("background:"@bg(x,y)) >[ (x*y) ] from y in [1 2 3 4 5 6 7 8 9 10] : [1--10*] from x in [1 2 3 4 5 6 7 8 9 10] : [1--10*];; </pre></div><p> The result is the xhtml code that generates the following table: </p><table width="100%"><tr><td align="center"><table border="1"><col/><col/><col/><col/><col/><col/><col/><col/><col/><col/><tr><td align="right" style="background:white">1</td><td align="right" style="background:lightblue">2</td><td align="right" style="background:white">3</td><td align="right" style="background:lightblue">4</td><td align="right" style="background:white">5</td><td align="right" style="background:lightblue">6</td><td align="right" style="background:white">7</td><td align="right" style="background:lightblue">8</td><td align="right" style="background:white">9</td><td align="right" style="background:lightblue">10</td></tr><tr><td align="right" style="background:yellow">2</td><td align="right" style="background:lightgreen">4</td><td align="right" style="background:yellow">6</td><td align="right" style="background:lightgreen">8</td><td align="right" style="background:yellow">10</td><td align="right" style="background:lightgreen">12</td><td align="right" style="background:yellow">14</td><td align="right" style="background:lightgreen">16</td><td align="right" style="background:yellow">18</td><td align="right" style="background:lightgreen">20</td></tr><tr><td align="right" style="background:white">3</td><td align="right" style="background:lightblue">6</td><td align="right" style="background:white">9</td><td align="right" style="background:lightblue">12</td><td align="right" style="background:white">15</td><td align="right" style="background:lightblue">18</td><td align="right" style="background:white">21</td><td align="right" style="background:lightblue">24</td><td align="right" style="background:white">27</td><td align="right" style="background:lightblue">30</td></tr><tr><td align="right" style="background:yellow">4</td><td align="right" style="background:lightgreen">8</td><td align="right" style="background:yellow">12</td><td align="right" style="background:lightgreen">16</td><td align="right" style="background:yellow">20</td><td align="right" style="background:lightgreen">24</td><td align="right" style="background:yellow">28</td><td align="right" style="background:lightgreen">32</td><td align="right" style="background:yellow">36</td><td align="right" style="background:lightgreen">40</td></tr><tr><td align="right" style="background:white">5</td><td align="right" style="background:lightblue">10</td><td align="right" style="background:white">15</td><td align="right" style="background:lightblue">20</td><td align="right" style="background:white">25</td><td align="right" style="background:lightblue">30</td><td align="right" style="background:white">35</td><td align="right" style="background:lightblue">40</td><td align="right" style="background:white">45</td><td align="right" style="background:lightblue">50</td></tr><tr><td align="right" style="background:yellow">6</td><td align="right" style="background:lightgreen">12</td><td align="right" style="background:yellow">18</td><td align="right" style="background:lightgreen">24</td><td align="right" style="background:yellow">30</td><td align="right" style="background:lightgreen">36</td><td align="right" style="background:yellow">42</td><td align="right" style="background:lightgreen">48</td><td align="right" style="background:yellow">54</td><td align="right" style="background:lightgreen">60</td></tr><tr><td align="right" style="background:white">7</td><td align="right" style="background:lightblue">14</td><td align="right" style="background:white">21</td><td align="right" style="background:lightblue">28</td><td align="right" style="background:white">35</td><td align="right" style="background:lightblue">42</td><td align="right" style="background:white">49</td><td align="right" style="background:lightblue">56</td><td align="right" style="background:white">63</td><td align="right" style="background:lightblue">70</td></tr><tr><td align="right" style="background:yellow">8</td><td align="right" style="background:lightgreen">16</td><td align="right" style="background:yellow">24</td><td align="right" style="background:lightgreen">32</td><td align="right" style="background:yellow">40</td><td align="right" style="background:lightgreen">48</td><td align="right" style="background:yellow">56</td><td align="right" style="background:lightgreen">64</td><td align="right" style="background:yellow">72</td><td align="right" style="background:lightgreen">80</td></tr><tr><td align="right" style="background:white">9</td><td align="right" style="background:lightblue">18</td><td align="right" style="background:white">27</td><td align="right" style="background:lightblue">36</td><td align="right" style="background:white">45</td><td align="right" style="background:lightblue">54</td><td align="right" style="background:white">63</td><td align="right" style="background:lightblue">72</td><td align="right" style="background:white">81</td><td align="right" style="background:lightblue">90</td></tr><tr><td align="right" style="background:yellow">10</td><td align="right" style="background:lightgreen">20</td><td align="right" style="background:yellow">30</td><td align="right" style="background:lightgreen">40</td><td align="right" style="background:yellow">50</td><td align="right" style="background:lightgreen">60</td><td align="right" style="background:yellow">70</td><td align="right" style="background:lightgreen">80</td><td align="right" style="background:yellow">90</td><td align="right" style="background:lightgreen">100</td></tr></table></td></tr></table></div><div class="meta"><p><a href="sitemap.html">Site map</a></p></div><div class="smallbox"><p><a href="index.html">CDuce: documentation</a>: <a href="tutorial.html">Tutorial</a>: Queries</p><p><a href="tutorial_references.html"><img class="icon" width="16" alt="Previous page:" height="16" src="img/left.gif"/> References</a> <a href="tutorial_higherorder.html"><img class="icon" width="16" alt="Next page:" height="16" src="img/right.gif"/> Higher-order functions</a></p></div></div></td></tr></table></body></html>